Discussion:
ZFS read errors
steven
2011-08-10 08:59:21 UTC
Hello
I am having problems with my ZFS pool. I have installed an LSI 3041E-S controller with 2 disks on it, and a further 4 disks are on the motherboard. I am getting read errors on the pool but not on any individual disk. Any idea where I should look to find the problem?
Thanks
Steven
> uname -a
SunOS XXXXX.XXXX.com 5.11 snv_151a i86pc i386 i86pc

> zpool status -v
pool: rz2pool
state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
entire pool from backup.
see: http://www.sun.com/msg/ZFS-8000-8A
scan: scrub in progress since Wed Aug 10 09:13:12 2011
78.5G scanned out of 720G at 74.0M/s, 2h28m to go
0 repaired, 10.89% done
config:

NAME          STATE   READ  WRITE  CKSUM
rz2pool       ONLINE   353      0      0
  raidz2-0    ONLINE     0      0      0
    c8t2d0    ONLINE     0      0      0
    c8t3d0    ONLINE     0      0      0
    c8t4d0    ONLINE     0      0      0
    c8t5d0    ONLINE     0      0      0
    c9t1d0    ONLINE     0      0      0
    c9t2d0    ONLINE     0      0      0

errors: Permanent errors have been detected in the following files:

rz2pool/datastore:<0x1>
rz2pool/datastore:<0xfffffffffffffffe>
rz2pool/datastore:<0xffffffffffffffff>
<0x49>:<0x1>
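
The config table above puts all 353 read errors on the pool row and none on the raidz2 vdev or the individual disks. Here is a minimal Python sketch (not part of the original post) of how one might pick out the rows with nonzero counters from saved `zpool status` text. The function name `nonzero_error_rows` and the `SAMPLE` string are hypothetical. It assumes the counters are plain integers, so abbreviated counts such as a value with a K suffix would be skipped.

```python
import re

# Rows copied from the config table in this post, forum markup stripped.
SAMPLE = """\
NAME STATE READ WRITE CKSUM
rz2pool ONLINE 353 0 0
raidz2-0 ONLINE 0 0 0
c8t2d0 ONLINE 0 0 0
c8t3d0 ONLINE 0 0 0
c8t4d0 ONLINE 0 0 0
c8t5d0 ONLINE 0 0 0
c9t1d0 ONLINE 0 0 0
c9t2d0 ONLINE 0 0 0
"""

# name, state, then three integer counters (READ, WRITE, CKSUM).
# Assumption: counters are plain integers; the header row does not match.
ROW = re.compile(r"^\s*(\S+)\s+(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$")


def nonzero_error_rows(text):
    """Return {name: (read, write, cksum)} for rows with any nonzero counter."""
    found = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # header, blank line, or a row in another format
        name, _state, read, write, cksum = match.groups()
        counts = (int(read), int(write), int(cksum))
        if any(counts):
            found[name] = counts
    return found


print(nonzero_error_rows(SAMPLE))
```

With the table from this post, only the `rz2pool` row is reported, which matches the symptom described: errors counted against the pool but not against any disk.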
--
This message posted from opensolaris.org
steven
2011-08-10 12:02:42 UTC
Also, should I be getting Illegal Request errors? (There are no hard or soft errors.)

Some more info (I am running a scrub, hence the high busy levels):

/var/log$ iostat -Ex
extended device statistics
device    r/s   w/s     kr/s    kw/s  wait  actv  svc_t  %w  %b
sd0       1.1  16.9     30.6   463.0   0.2   0.0   10.6   1   1
sd1       1.0  16.9     30.3   463.0   0.2   0.0   13.5   2   2
sd2     208.7   4.8  14493.8    16.0   3.0   0.5   16.5  45  48
sd3     212.4   4.8  14493.4    16.0   2.6   0.4   13.8  41  44
sd4     221.9   4.8  14491.9    16.0   0.0   1.8    8.1   0  46
sd5     212.3   4.8  14493.5    16.0   2.5   0.4   13.4  41  44
sd6       0.0   0.0      0.0     0.0   0.0   0.0    0.0   0   0
sd8     231.7   4.8  14692.7    16.3   0.0   1.8    7.5   0  42
sd9     239.9   4.7  14691.7    16.3   0.0   1.2    5.1   0  36
sd0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA Product: WDC WD2500SD-01K Revision: 2D08 Serial No:
Size: 250.06GB <250059350016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 3 Predictive Failure Analysis: 0
sd1 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA Product: WDC WD2500JD-75G Revision: 5D02 Serial No:
Size: 250.00GB <250000000000 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 4 Predictive Failure Analysis: 0
sd2 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA Product: WDC WD3200SD-01K Revision: 5J08 Serial No:
Size: 320.07GB <320072933376 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 2 Predictive Failure Analysis: 0
sd3 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA Product: WDC WD2500SD-01K Revision: 2D08 Serial No:
Size: 250.06GB <250059350016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 2 Predictive Failure Analysis: 0
sd4 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA Product: WDC WD5000YS-01M Revision: 2E07 Serial No:
Size: 500.11GB <500107862016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 2 Predictive Failure Analysis: 0
sd5 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA Product: WDC WD2500SD-01K Revision: 2D08 Serial No:
Size: 250.06GB <250059350016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 2 Predictive Failure Analysis: 0
sd6 Soft Errors: 0 Hard Errors: 5 Transport Errors: 0
Vendor: CREATIVE Product: DVD-ROM DVD1243E Revision: IC01 Serial No:
Size: 0.00GB <0 bytes>
Media Error: 0 Device Not Ready: 5 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
sd8 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA Product: WDC WD5000AACS-0 Revision: 1B01 Serial No:
Size: 500.11GB <500106780160 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 7 Predictive Failure Analysis: 0
sd9 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA Product: ST31000340AS Revision: SD15 Serial No:
Size: 1000.20GB <1000204886016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 7 Predictive Failure Analysis: 0
/var/log$
Bob Friesenhahn
2011-08-10 19:30:32 UTC
Post by steven
Also should be getting Illegal Request errors? (no hard or soft errors)
Illegal Request sounds like the OS is making a request that the drive
firmware does not support. It is also possible that the request
became corrupted due to an interface issue.

Bob
--
Bob Friesenhahn
***@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Roy Sigurd Karlsbakk
2011-08-10 19:47:27 UTC
What sort of controller/backplane/etc. are you using? I've seen similar iostat output with Western Digital drives on a Supermicro SAS expander.

roy

_______________________________________________
zfs-discuss mailing list
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
--
Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
***@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum is presented intelligibly. It is an elementary imperative for all pedagogues to avoid excessive use of idioms of foreign origin. In most cases adequate and relevant synonyms exist in Norwegian.
steven
2011-08-10 21:00:20 UTC
Hello,

Thanks for the reply. I used to use the onboard SATA ports (Intel DQ965DF), but I have added an LSI-3041E-S controller, and this is where I get the problems.
The controller is a Sun version (note the S, not R, in the model) but it is a PC version, and I tried flashing it with the firmware from both LSI and Sun (I still get the errors).
Pity, as I had hoped the controllers would solve my expansion problems.
Also, I am mixing SATA and SATA II drives, but I have tried setting the drive jumpers to SATA I mode.

The LSI card won't let me press Ctrl-C at the BIOS to get into the config utility, but I believe this is normal on Intel motherboards. I am not after the RAID capabilities.

Thanks for the help. Let me know if there are any ideas on how to get this card working (Sun part number SG-XPCIE4SAS3-Z).

Regards
Steven
steven
2011-08-12 07:25:06 UTC
Hello

Well, I have got to the bottom of it (sort of).
I have a shared IRQ, but this is not the problem.
The controller is working, with either the LSI firmware 1.26 or the Sun firmware 1.28 (no setup, no config, it just works).
All the drives are healthy; they are an odd mix of sizes and types, and one is a 'green' drive. (Thanks to Roy for pointing me in this direction.)

So I rebuilt the system as it used to be (working fine) and I get the same errors. That leaves the only other change I made as the problem: I turned on encryption!

Turn it off and all works well.

I still get the Illegal Request errors (but the count is low):
(12 hrs, 600 GB of data written, constant scrub)
:~$ iostat -E | grep Illegal
Illegal Request: 12 Predictive Failure Analysis: 0
Illegal Request: 11 Predictive Failure Analysis: 0
Illegal Request: 10 Predictive Failure Analysis: 0
Illegal Request: 10 Predictive Failure Analysis: 0
Illegal Request: 10 Predictive Failure Analysis: 0
Illegal Request: 10 Predictive Failure Analysis: 0
Illegal Request: 0 Predictive Failure Analysis: 0
Illegal Request: 9 Predictive Failure Analysis: 0
Illegal Request: 9 Predictive Failure Analysis: 0
$
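
The grep above drops the device names, so the counts cannot be matched to individual disks. Here is a minimal Python sketch (not from the original post) that keeps each count paired with its device, assuming `iostat -E` text shaped like the listing earlier in this thread. The function name `illegal_requests` and the `SAMPLE` string are hypothetical; the sample values are taken from the earlier listing.

```python
import re

# Three device blocks copied from the earlier `iostat -Ex` listing.
SAMPLE = """\
sd0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA Product: WDC WD2500SD-01K Revision: 2D08 Serial No:
Size: 250.06GB <250059350016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 3 Predictive Failure Analysis: 0
sd6 Soft Errors: 0 Hard Errors: 5 Transport Errors: 0
Vendor: CREATIVE Product: DVD-ROM DVD1243E Revision: IC01 Serial No:
Size: 0.00GB <0 bytes>
Media Error: 0 Device Not Ready: 5 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
sd8 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA Product: WDC WD5000AACS-0 Revision: 1B01 Serial No:
Size: 500.11GB <500106780160 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 7 Predictive Failure Analysis: 0
"""

# A device block starts with "<name> Soft Errors: ...".
DEVICE = re.compile(r"^(\S+)\s+Soft Errors:\s*\d+")
ILLEGAL = re.compile(r"Illegal Request:\s*(\d+)")


def illegal_requests(text):
    """Map each device name to its Illegal Request count."""
    counts = {}
    device = None
    for raw in text.splitlines():
        line = raw.strip()
        start = DEVICE.match(line)
        if start:
            device = start.group(1)  # remember which block we are in
            continue
        hit = ILLEGAL.search(line)
        if hit and device is not None:
            counts[device] = int(hit.group(1))
    return counts


for name, count in illegal_requests(SAMPLE).items():
    print(name, count)
```

Run against the full `iostat -E` output, this would show whether the counts climb on particular disks or on everything behind one controller.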

I rebuilt the system with the new controller, then the new (additional) drives, loaded it with data and scrubbed it (many times): no errors.
(I used to get errors after a reboot with very little data, guaranteed.)


I am not sure why encryption is a problem, but I suspect it has to do with the drive mix. Has anyone else had problems with encryption?

As I understand it, it is just like having another filter between your data and the disk, so I have no idea why it would create errors.

I did wonder if encryption=on is just showing me errors that are there anyway, but I have done many a read/write/scrub/reboot and I don't get a single error. The Illegal Request count is very much lower, so I wonder if my controllers cannot deal with some command specific to encryption.

Which brings me back to the mix of drives. I did try creating a ZFS volume for each disk (encrypted) to see if it would highlight a problematic drive, but they all gave errors (using both controllers).

Solution: turn encryption off (a pity, as I wanted to give it a go).


Regards
Steven