Discussion:
mpt_sas multipath problem?
Marion Hakanson
2013-01-07 21:20:08 UTC
Greetings,

We're trying out a new JBOD here. Multipath (mpxio) is not working,
and we could use some feedback and/or troubleshooting advice.

The OS is oi151a7, running on an existing server with a 54TB pool
of internal drives. I believe the server hardware is not relevant
to the JBOD issue, although the internal drives do appear to the
OS with multipath device names (despite the fact that these
internal drives are cabled up in a single-path configuration). If
anything, this does confirm that multipath is enabled in mpt_sas.conf
via the mpxio-disable="no" directive (internal HBAs are LSI SAS,
2x 9201-16i and 1x 9211-8i).
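
For anyone who wants to script the mpt_sas.conf check rather than eyeball it, here is a minimal sketch. It assumes the setting appears as a global `mpxio-disable="no";` line; per-port entries (lines that begin with name= or parent=) are deliberately ignored, and the function name and sample text are illustrative only.

```python
import re
from typing import Optional


def mpxio_enabled(conf_text: str) -> Optional[bool]:
    """Return True if the last global mpxio-disable line says "no",
    False if it says "yes", and None if no such line is present."""
    enabled = None
    for line in conf_text.splitlines():
        # Drop trailing comments, then surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        match = re.match(r'^mpxio-disable\s*=\s*"(yes|no)"\s*;', line)
        if match:
            # mpxio-disable="no" means multipathing is enabled.
            enabled = (match.group(1) == "no")
    return enabled


# Illustrative content only, not a copy of any real mpt_sas.conf.
sample = '''
# example driver configuration
mpxio-disable="no";
'''
print(mpxio_enabled(sample))
```

On a real system the text would come from reading the driver's .conf file; the parsing is kept separate here so it can be tried on any string.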

The JBOD is a SuperMicro 847E26-RJBOD1, with the front backplane
daisy-chained to the rear backplane (both expanders). Each of the two
expander chains is connected to one port of an LSI SAS 9200-8e HBA. From
what we've seen, all of this hardware is reported as working for others and
well supported, and this 9200-8e is running the -IT firmware, version 15.0.0.0.

The drives are 40x of the WD4001FYYG SAS 4TB variety, firmware VR02.
The spot-checks I've done so far seem to show that both device instances
of a drive show up in "prtconf -Dv" with identical serial numbers and
identical "devid" and "guid" values, so I'm not sure what might be
missing to allow mpxio to recognize them as the same device.
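
The spot-check described above can be done across all drives at once by grouping device instances on their devid. The following is a minimal sketch that assumes the (device path, devid) pairs have already been extracted from the prtconf listing by some other means; the paths and devid strings shown are made-up placeholders, not real output.

```python
from collections import defaultdict


def group_paths_by_devid(instances):
    """Map each devid to the list of device paths that report it."""
    groups = defaultdict(list)
    for path, devid in instances:
        groups[devid].append(path)
    return dict(groups)


# Hypothetical (path, devid) pairs; real values would come from prtconf.
instances = [
    ("/pci@0/sas@0/disk@a", "id1,sd@nAAAA"),
    ("/pci@0/sas@1/disk@a", "id1,sd@nAAAA"),
    ("/pci@0/sas@0/disk@b", "id1,sd@nBBBB"),
]

for devid, paths in group_paths_by_devid(instances).items():
    print(devid, len(paths), "path(s)")
```

A drive that is visible on both expander chains should show two paths under one devid; a devid with a single path points at a cabling or discovery problem rather than an mpxio one.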

Has anyone out there got this type of hardware working? In a multipath
configuration? Suggestions on mdb or dtrace code I can use to debug?
Are there "secrets" to the internal daisy-chain cabling that our vendor
is not aware of?

Thanks and regards,

Marion
Richard Elling
2013-01-07 22:42:44 UTC
Post by Marion Hakanson
> Greetings,
> We're trying out a new JBOD here. Multipath (mpxio) is not working,
> and we could use some feedback and/or troubleshooting advice.
Sometimes the mpxio detection doesn't work properly. You can try to
whitelist them,
https://www.illumos.org/issues/644

-- richard
_______________________________________________
zfs-discuss mailing list
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
--

***@RichardElling.com
+1-760-896-4422
James C. McPherson
2013-01-08 01:07:06 UTC
Post by Marion Hakanson
> Greetings,
> We're trying out a new JBOD here. Multipath (mpxio) is not working,
> and we could use some feedback and/or troubleshooting advice.
Output from 'prtconf -v' would help, as would a cogent
description of what you are looking at to determine that
MPxIO isn't working.



James C. McPherson
--
Oracle
Systems / Solaris / Core
http://www.jmcpdotcom.com/blog
Marion Hakanson
2013-01-08 03:33:12 UTC
> Sometimes the mpxio detection doesn't work properly. You can try to whitelist
> them, https://www.illumos.org/issues/644
Thanks Richard, I was hoping I hadn't just made up my vague memory of such
functionality. We'll give it a try.
That did the trick. I added these lines to /kernel/drv/scsi_vhci.conf,
at the end of the file:

scsi-vhci-failover-override =
"WD      WD4001FYYG-01SL3", "f_sym"; # WD RE 4TB SAS HDD

A reboot was involved, as I wasn't able to coax the system into re-reading
the scsi_vhci.conf file using "update_drv scsi_vhci", nor by unplugging
and replugging the JBOD's SAS cables, "cfgadm -c unconfigure c49", etc.
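
For other drive models, the override entry can be generated rather than typed by hand. This is a minimal sketch built on the assumption that scsi_vhci matches on the SCSI INQUIRY vendor ID padded with spaces to eight characters, followed immediately by the product ID; web archives tend to collapse that run of spaces, so it is worth checking the padding in the actual file. The function name is illustrative, and "f_sym" is simply the module used above.

```python
def failover_override_entry(vendor: str, product: str, module: str = "f_sym") -> str:
    """Build a scsi-vhci-failover-override entry for scsi_vhci.conf.

    Assumption: the vendor ID occupies a fixed eight-character field,
    space padded, with the product ID appended directly after it.
    """
    if len(vendor) > 8:
        raise ValueError("vendor ID must be at most 8 characters")
    vid_pid = vendor.ljust(8) + product
    return f'scsi-vhci-failover-override =\n"{vid_pid}", "{module}";'


print(failover_override_entry("WD", "WD4001FYYG-01SL3"))
```

The vendor and product strings should be taken from what the drive itself reports (for example in the prtconf listing), not from the label on the drive.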

I'm off to exercise it with filebench tomorrow....

Thanks and regards,

Marion
Marion Hakanson
2013-01-08 03:39:48 UTC
> Output from 'prtconf -v' would help, as would a cogent description of what
> you are looking at to determine that MPxIO isn't working.
Sorry James, I must've made a cut-and-paste-o and left out my description
of the symptom. That being, 40 new drives show up as 80 new disk devices
at the OS level (in "format", in "cfgadm -alv", in "ls /dev/dsk" and in
"prtconf -Dv" listings).

Adding the drives' string to a white-list in scsi_vhci.conf got us going,
thanks to Richard's reminder. I do have before and after prtconf listings,
if anyone is interested.

Regards,

Marion
