Discussion:
Lots of overhead with ZFS - what am I doing wrong?
Marko Milisavljevic
2007-05-14 07:53:10 UTC
Permalink
I was trying to simply test bandwidth that Solaris/ZFS (Nevada b63) can deliver from a drive, and doing this:
dd if=(raw disk) of=/dev/null gives me around 80MB/s, while dd if=(file on ZFS) of=/dev/null gives me only 35MB/s!? I am getting basically the same result whether it is a single ZFS drive, a mirror, or a stripe (I am testing with two Seagate 7200.10 320G drives hanging off the same interface card).

On the test machine I also have an old disk with UFS on PATA interface (Seagate 7200.7 120G). dd from raw disk gives 58MB/s and dd from file on UFS gives 45MB/s - far less relative slowdown compared to raw disk.

This is just an AthlonXP 2500+ with 32bit PCI SATA sil3114 card, but nonetheless, the hardware has the bandwidth to fully saturate the hard drive, as seen by dd from the raw disk device. What is going on? Am I doing something wrong or is ZFS just not designed to be used on humble hardware?

My goal is to have it go fast enough to saturate gigabit ethernet - around 75MB/s. I don't plan on replacing hardware - after all, Linux with RAID10 gives me this already. I was hoping to switch to Solaris/ZFS to get checksums (which wouldn't seem to account for slowness, because CPU stays under 25% during all this).

I can temporarily scrape together an x64 machine with an ICH7 SATA interface - I'll try the same test with the same drives on that to eliminate 32-bitness and PCI slowness from the equation. And while someone will say dd has little to do with real-life file server performance - it actually has a lot to do with it, because most use of this server is to copy multi-gigabyte files back and forth a few times per day. Hardly any random access involved (fragmentation aside).


This message posted from opensolaris.org
Al Hopper
2007-05-14 13:11:34 UTC
Permalink
On Mon, 14 May 2007, Marko Milisavljevic wrote:

[ ... reformatted ....]
Post by Marko Milisavljevic
I was trying to simply test bandwidth that Solaris/ZFS (Nevada b63) can
deliver from a drive, and doing this: dd if=(raw disk) of=/dev/null
gives me around 80MB/s, while dd if=(file on ZFS) of=/dev/null gives me
only 35MB/s!?. I am getting basically the same result whether it is
single zfs drive, mirror or a stripe (I am testing with two Seagate
7200.10 320G drives hanging off the same interface card).
Which interface card?

... snip ....

Al Hopper Logical Approach Inc, Plano, TX. ***@logical-approach.com
Voice: 972.379.2133 Fax: 972.379.2134 Timezone: US CDT
OpenSolaris Governing Board (OGB) Member - Apr 2005 to Mar 2007
http://www.opensolaris.org/os/community/ogb/ogb_2005-2007/
Richard Elling
2007-05-14 15:57:19 UTC
Permalink
Post by Marko Milisavljevic
dd if=(raw disk) of=/dev/null gives me around 80MB/s, while dd if=(file on ZFS) of=/dev/null gives me only 35MB/s!?. I am getting basically the same result whether it is single zfs drive, mirror or a stripe (I am testing with two Seagate 7200.10 320G drives hanging off the same interface card).
Checksum is a contributor. AthlonXPs are long in the tooth. Disable checksum and experiment.
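For example, assuming the file system is named tank/test (adjust to your own dataset), something like

zfs set checksum=off tank/test

then re-copy the test file so the new blocks are written without checksums, rerun the dd, and set checksum back to "on" afterwards.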
-- richard
Post by Marko Milisavljevic
On the test machine I also have an old disk with UFS on PATA interface (Seagate 7200.7 120G). dd from raw disk gives 58MB/s and dd from file on UFS gives 45MB/s - far less relative slowdown compared to raw disk.
This is just an AthlonXP 2500+ with 32bit PCI SATA sil3114 card, but nonetheless, the hardware has the bandwidth to fully saturate the hard drive, as seen by dd from the raw disk device. What is going on? Am I doing something wrong or is ZFS just not designed to be used on humble hardware?
My goal is to have it go fast enough to saturate gigabit ethernet - around 75MB/s. I don't plan on replacing hardware - after all, Linux with RAID10 gives me this already. I was hoping to switch to Solaris/ZFS to get checksums (which wouldn't seem to account for slowness, because CPU stays under 25% during all this).
I can temporarily scrape together an x64 machine with ICH7 SATA interface - I'll try the same test with same drives on that to elliminate 32-bitness and PCI slowness from the equation. And while someone will say dd has little to do with real-life file server performance - it actually has a lot to do with it, because most of use of this server is to copy multi-gigabyte files to and fro a few times per day. Hardly any random access involved (fragmentation aside).
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Marko Milisavljevic
2007-05-14 20:02:26 UTC
Permalink
To reply to my own message.... this article offers lots of insight into why dd reading directly from the raw disk is fast, while accessing a file through the file system may be slow.

http://www.informit.com/articles/printerfriendly.asp?p=606585&rl=1

So, I guess what I'm wondering now is, does it happen to everyone that ZFS is under half the speed of raw disk access? What speeds are other people getting trying to dd a file through zfs file system? Something like

dd if=/pool/mount/file of=/dev/null bs=128k (assuming you are using default ZFS block size)

how does that compare to:

dd if=/dev/dsk/diskinzpool of=/dev/null bs=128k count=10000

If you could please post your MB/s and show the output of zpool status so we can see your disk configuration, I would appreciate it. Please use a file that is 100MB or more - results are too random with small files. Also make sure ZFS is not caching the file already!
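(One way to make sure of that - assuming your pool is named tank, adjust as needed - is to export and re-import the pool between runs, which empties the cache:

zpool export tank
zpool import tank

or just reboot before the test.)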

What I am seeing is that ZFS performance for sequential access is about 45% of raw disk access, while UFS (as well as ext3 on Linux) is around 70%. For a workload consisting mostly of reading large files sequentially, it would seem then that ZFS is the wrong tool performance-wise. But, it could be just my setup, so I would appreciate more data points.


This message posted from opensolaris.org
j***@sun.com
2007-05-14 20:43:31 UTC
Permalink
This certainly isn't the case on my machine.

$ /usr/bin/time dd if=/test/filebench/largefile2 of=/dev/null bs=128k count=10000
10000+0 records in
10000+0 records out

real 1.3
user 0.0
sys 1.2

# /usr/bin/time dd if=/dev/dsk/c0t0d0 of=/dev/null bs=128k count=10000
10000+0 records in
10000+0 records out

real 22.3
user 0.0
sys 2.2

This looks like 56 MB/s on the /dev/dsk and 961 MB/s on the pool.
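(That's simply the 10000 x 128k = 1,280,000 KB transferred divided by the real time: roughly 1250 MB / 22.3 s = 56 MB/s for the raw device, and 1250 MB / 1.3 s = 961 MB/s for the file on the pool.)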

My pool is configured into a 46 disk RAID-0 stripe. I'm going to omit
the zpool status output for the sake of brevity.
Post by Marko Milisavljevic
What I am seeing is that ZFS performance for sequential access is
about 45% of raw disk access, while UFS (as well as ext3 on Linux) is
around 70%. For workload consisting mostly of reading large files
sequentially, it would seem then that ZFS is the wrong tool
performance-wise. But, it could be just my setup, so I would
appreciate more data points.
This isn't what we've observed in much of our performance testing.
It may be a problem with your config, although I'm not an expert on
storage configurations. Would you mind providing more details about
your controller, disks, and machine setup?

-j
Marko Milisavljevic
2007-05-14 21:41:42 UTC
Permalink
Thank you for those numbers.

I should have mentioned that I was mostly interested in single disk or small
array performance, as it is not possible for dd to meaningfully access
multiple-disk configurations without going through the file system. I find
it curious that there is such a large slowdown by going through the file system
(with a single drive configuration), especially compared to UFS or ext3.

I simply have a small SOHO server and I am trying to evaluate which OS to
use to keep a redundant disk array. With unreliable consumer-level hardware,
ZFS and the checksum feature are very interesting and the primary selling
point compared to a Linux setup, for as long as ZFS can generate enough
bandwidth from the drive array to saturate single gigabit ethernet.

My hardware at the moment is the "wrong" choice for Solaris/ZFS - PCI 3114
SATA controller on a 32-bit AthlonXP, according to many posts I found.
However, since dd over raw disk is capable of extracting 75+MB/s from this
setup, I keep feeling that surely I must be able to get at least that much
from reading a pair of striped or mirrored ZFS drives. But I can't - single
drive or 2-drive stripes or mirrors, I only get around 34MB/s going through
ZFS. (I made sure mirror was rebuilt and I resilvered the stripes.)
Everything is stock Nevada b63 installation, so I haven't messed it up with
misguided tuning attempts. Don't know if it matters, but test file was
created originally from /dev/random. Compression is off, and everything is
default. CPU utilization remains low at all times (haven't seen it go over
25%).
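(For anyone wanting to reproduce the setup: a ~2GB file of incompressible data can be created with something along these lines - the path is only an example, and /dev/urandom is the non-blocking alternative to the /dev/random I used originally:

dd if=/dev/urandom of=/tank/test/bigfile bs=128k count=16000

With compression off, the random data is stored as-is.)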
Post by j***@sun.com
This certainly isn't the case on my machine.
$ /usr/bin/time dd if=/test/filebench/largefile2 of=/dev/null bs=128k count=10000
10000+0 records in
10000+0 records out
real 1.3
user 0.0
sys 1.2
# /usr/bin/time dd if=/dev/dsk/c0t0d0 of=/dev/null bs=128k count=10000
10000+0 records in
10000+0 records out
real 22.3
user 0.0
sys 2.2
This looks like 56 MB/s on the /dev/dsk and 961 MB/s on the pool.
My pool is configured into a 46 disk RAID-0 stripe. I'm going to omit
the zpool status output for the sake of brevity.
Post by Marko Milisavljevic
What I am seeing is that ZFS performance for sequential access is
about 45% of raw disk access, while UFS (as well as ext3 on Linux) is
around 70%. For workload consisting mostly of reading large files
sequentially, it would seem then that ZFS is the wrong tool
performance-wise. But, it could be just my setup, so I would
appreciate more data points.
This isn't what we've observed in much of our performance testing.
It may be a problem with your config, although I'm not an expert on
storage configurations. Would you mind providing more details about
your controller, disks, and machine setup?
-j
j***@sun.com
2007-05-15 01:42:36 UTC
Permalink
Marko,

I tried this experiment again using 1 disk and got nearly identical
times:

# /usr/bin/time dd if=/dev/dsk/c0t0d0 of=/dev/null bs=128k count=10000
10000+0 records in
10000+0 records out

real 21.4
user 0.0
sys 2.4

$ /usr/bin/time dd if=/test/filebench/testfile of=/dev/null bs=128k count=10000
10000+0 records in
10000+0 records out

real 21.0
user 0.0
sys 0.7
[I]t is not possible for dd to meaningfully access multiple-disk
configurations without going through the file system. I find it
curious that there is such a large slowdown by going through file
system (with single drive configuration), especially compared to UFS
or ext3.
Comparing a filesystem to raw dd access isn't a completely fair
comparison either. Few filesystems actually lay out all of their data
and metadata so that every read is a completely sequential read.
I simply have a small SOHO server and I am trying to evaluate which OS to
use to keep a redundant disk array. With unreliable consumer-level hardware,
ZFS and the checksum feature are very interesting and the primary selling
point compared to a Linux setup, for as long as ZFS can generate enough
bandwidth from the drive array to saturate single gigabit ethernet.
I would take Bart's recommendation and go with Solaris on something like a
dual-core box with 4 disks.
My hardware at the moment is the "wrong" choice for Solaris/ZFS - PCI 3114
SATA controller on a 32-bit AthlonXP, according to many posts I found.
Bill Moore lists some controller recommendations here:

http://mail.opensolaris.org/pipermail/zfs-discuss/2006-March/016874.html
However, since dd over raw disk is capable of extracting 75+MB/s from this
setup, I keep feeling that surely I must be able to get at least that much
from reading a pair of striped or mirrored ZFS drives. But I can't - single
drive or 2-drive stripes or mirrors, I only get around 34MB/s going through
ZFS. (I made sure mirror was rebuilt and I resilvered the stripes.)
Maybe this is a problem with your controller? What happens when you
have two simultaneous dd's to different disks running? This would
simulate the case where you're reading from the two disks at the same
time.
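Something along these lines (device names are placeholders for your two disks):

# dd if=/dev/dsk/c0d0 of=/dev/null bs=128k count=10000 &
# dd if=/dev/dsk/c0d1 of=/dev/null bs=128k count=10000 &
# wait

while watching iostat -xnz in another window should show whether the
controller can sustain both streams at once.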

-j
Marko Milisavljevic
2007-05-15 05:48:32 UTC
Permalink
I am very grateful to everyone who took the time to run a few tests to help
me figure what is going on. As per j's suggestions, I tried some
simultaneous reads, and a few other things, and I am getting interesting and
confusing results.

All tests are done using two Seagate 320G drives on sil3114. In each test I
am using dd if=.... of=/dev/null bs=128k count=10000. Each drive is freshly
formatted with one 2G file copied to it. That way dd from raw disk and from
file are using roughly the same area of the disk. I tried using raw, ZFS and UFS,
single drives and two simultaneously (just executing dd commands in separate
terminal windows). These are snapshots of iostat -xnczpm 3 captured
somewhere in the middle of the operation. I am not bothering to report CPU%
as it never rose over 50%, and was uniformly proportional to reported
throughput.
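(Concretely, each run looked something like the following, with iostat going
in a second terminal; device and file names here are just the ones on my box:

dd if=/dev/dsk/c0d1 of=/dev/null bs=128k count=10000        <- raw case
dd if=/tank/test/bigfile of=/dev/null bs=128k count=10000   <- ZFS file case
iostat -xnczpm 3)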

single drive raw:
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
1378.4 0.0 77190.7 0.0 0.0 1.7 0.0 1.2 0 98 c0d1

single drive, ufs file
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
1255.1 0.0 69949.6 0.0 0.0 1.8 0.0 1.4 0 100 c0d0

Small slowdown, but pretty good.

single drive, zfs file
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
258.3 0.0 33066.6 0.0 33.0 2.0 127.7 7.7 100 100 c0d1

Now that is odd. Why so much waiting? Also, unlike with raw or UFS, kr/s /
r/s gives 128K, as I would imagine it should.

simultaneous raw:
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
797.0 0.0 44632.0 0.0 0.0 1.8 0.0 2.3 0 100 c0d0
795.7 0.0 44557.4 0.0 0.0 1.8 0.0 2.3 0 100 c0d1

This PCI interface seems to be saturated at 90MB/s. Adequate if the goal is
to serve files on gigabit SOHO network.

simultaneous raw on c0d1 and ufs on c0d0:
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
722.4 0.0 40246.8 0.0 0.0 1.8 0.0 2.5 0 100 c0d0
717.1 0.0 40156.2 0.0 0.0 1.8 0.0 2.5 0 99 c0d1

hmm, can no longer get the 90MB/sec.

simultaneous zfs on c0d1 and raw on c0d0:
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 0.7 0.0 1.8 0.0 0.0 0.0 0.1 0 0 c1d0
334.9 0.0 18756.0 0.0 0.0 1.9 0.0 5.5 0 97 c0d0
172.5 0.0 22074.6 0.0 33.0 2.0 191.3 11.6 100 100 c0d1

Everything is slow.

What happens if we throw onboard IDE interface into the mix?
simultaneous raw SATA and raw PATA:
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
1036.3 0.3 58033.9 0.3 0.0 1.6 0.0 1.6 0 99 c1d0
1422.6 0.0 79668.3 0.0 0.0 1.6 0.0 1.1 1 98 c0d0

Both at maximum throughput.

Read ZFS on SATA drive and raw disk on PATA interface:
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
1018.9 0.3 57056.1 4.0 0.0 1.7 0.0 1.7 0 99 c1d0
268.4 0.0 34353.1 0.0 33.0 2.0 122.9 7.5 100 100 c0d0

SATA is slower with ZFS as expected by now, but ATA remains at full speed.
So they are operating quite independently. Except...

What if we read a UFS file from the PATA disk and ZFS from SATA:
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
792.8 0.0 44092.9 0.0 0.0 1.8 0.0 2.2 1 98 c1d0
224.0 0.0 28675.2 0.0 33.0 2.0 147.3 8.9 100 100 c0d0

Now that is confusing! Why did SATA/ZFS slow down too? I've retried this a
number of times, not a fluke.

Finally, after reviewing all this, I've noticed another interesting bit...
whenever I read from raw disks or UFS files, SATA or PATA, kr/s over r/s is
56k, suggesting that the underlying I/O system is using that as some kind of
native block size? (even though dd is requesting 128k). But when reading ZFS
files, this always comes to 128k, which is expected, since that is ZFS
default (and same thing happens regardless of bs= in dd). On the theory that
my system just doesn't like 128k reads (I'm desperate!), and that this would
explain the whole slowdown and wait/wsvc_t column, I tried changing recsize
to 32k and rewriting the test file. However, accessing ZFS files continues
to show 128k reads, and it is just as slow. Is there a way to either confirm
that the ZFS file in question is indeed written with 32k records or, even
better, to force ZFS to use 56k when accessing the disk? Or perhaps I just
misunderstand the implications of the iostat output.
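(For reference, changing and checking the record size is done with something
like the following, assuming the file system is tank/test:

zfs set recordsize=32k tank/test
zfs get recordsize tank/test

The property only applies to files written after the change, which is why I
rewrote the test file rather than reusing the old one.)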

I've repeated each of these tests a few times and double-checked, and the
numbers, although snapshots of a point in time, fairly represent averages.

I have no idea what to make of all this, except that ZFS has a problem
with this hardware/driver combination that UFS and other traditional file systems
don't. Is it a bug in the driver that ZFS is inadvertently exposing? A
specific feature that ZFS assumes the hardware to have, but it doesn't? Who
knows! I will have to give up on Solaris/ZFS on this hardware for now, but I
hope to try it again sometime in the future. I'll give FreeBSD/ZFS a spin to
see if it fares better (although at this point in its development it is
probably more risky than just sticking with Linux and missing out on ZFS).

(Another contributor suggested turning checksumming off - it made no
difference. Same for atime. Compression was always off.)
Post by j***@sun.com
Marko,
I tried this experiment again using 1 disk and got nearly identical
# /usr/bin/time dd if=/dev/dsk/c0t0d0 of=/dev/null bs=128k count=10000
10000+0 records in
10000+0 records out
real 21.4
user 0.0
sys 2.4
$ /usr/bin/time dd if=/test/filebench/testfile of=/dev/null bs=128k count=10000
10000+0 records in
10000+0 records out
real 21.0
user 0.0
sys 0.7
[I]t is not possible for dd to meaningfully access multiple-disk
configurations without going through the file system. I find it
curious that there is such a large slowdown by going through file
system (with single drive configuration), especially compared to UFS
or ext3.
Comparing a filesystem to raw dd access isn't a completely fair
comparison either. Few filesystems actually layout all of their data
and metadata so that every read is a completely sequential read.
I simply have a small SOHO server and I am trying to evaluate which OS
to
use to keep a redundant disk array. With unreliable consumer-level
hardware,
ZFS and the checksum feature are very interesting and the primary
selling
point compared to a Linux setup, for as long as ZFS can generate enough
bandwidth from the drive array to saturate single gigabit ethernet.
I would take Bart's reccomendation and go with Solaris on something like a
dual-core box with 4 disks.
My hardware at the moment is the "wrong" choice for Solaris/ZFS - PCI
3114
SATA controller on a 32-bit AthlonXP, according to many posts I found.
http://mail.opensolaris.org/pipermail/zfs-discuss/2006-March/016874.html
However, since dd over raw disk is capable of extracting 75+MB/s from
this
setup, I keep feeling that surely I must be able to get at least that
much
from reading a pair of striped or mirrored ZFS drives. But I can't -
single
drive or 2-drive stripes or mirrors, I only get around 34MB/s going
through
ZFS. (I made sure mirror was rebuilt and I resilvered the stripes.)
Maybe this is a problem with your controller? What happens when you
have two simultaneous dd's to different disks running? This would
simulate the case where you're reading from the two disks at the same
time.
-j
Nick G
2007-05-15 11:31:00 UTC
Permalink
Post by Marko Milisavljevic
I have no idea what to make of all
this, except that it ZFS has a problem with this
hardware/drivers that UFS and other traditional file
systems, don't. Is it a bug in the driver that
ZFS is inadvertently exposing? A specific feature
that ZFS assumes the hardware to have, but it
doesn't? Who knows! I will have to give up on
Solaris/ZFS on this hardware for now, but I hope to
try it again sometime in the future. I'll give
FreeBSD/ZFS a spin to see if it fares better
(although at this point in its development it is
probably more risky then just sticking with Linux and
missing out on ZFS).
If you do give FreeBSD a try, if just for the sake of seeing if ZFS continues to perform badly on your hardware, use the 200705 snapshot or newer, and make sure you turn off the debugging support that is built into -CURRENT by default; ZFS seems to like _fast_ memory.

Make malloc behave like a release:
# cd /etc
# ln -s aj malloc.conf

Rebuild your kernel to disable the sanity checks in -CURRENT. You could probably just comment out WITNESS* and INVARIANT*, but I wanted to test the equivalent of a production release system here, so I commented all of it out and recompiled.

#makeoptions DEBUG=-g # Build kernel with gdb(1) debug symbols
#options KDB # Enable kernel debugger support.
#options DDB # Support DDB.
#options GDB # Support remote GDB.
#options INVARIANTS # Enable calls of extra sanity checking
#options INVARIANT_SUPPORT # Extra sanity checks of internal structures, required by INVARIANTS
#options WITNESS # Enable checks to detect deadlocks and cycles
#options WITNESS_SKIPSPIN # Don't run witness on spinlocks for speed


Your filesystem/data should be safe on FreeBSD right now since pretty much all of the core ZFS code is the same. That doesn't mean something else won't cause a panic/reboot, since it is a devel branch! You are right to be hesitant to put it into production for a client. If it's just for home use, I say go for it; I've been beating on it for a few days and have been pleasantly surprised. Obviously if you can trigger a panic, you'd want to re-enable debugging if you care to fix it.


This message posted from opensolaris.org
j***@sun.com
2007-05-15 21:03:27 UTC
Permalink
Each drive is freshly formatted with one 2G file copied to it.
How are you creating each of these files?

Also, would you please include the output from the isalist(1) command?
These are snapshots of iostat -xnczpm 3 captured somewhere in the
middle of the operation.
Have you double-checked that this isn't a measurement problem by
measuring zfs with zpool iostat (see zpool(1M)) and verifying that
outputs from both iostats match?
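For example, something like

# zpool iostat tank 3

run alongside the regular iostat (substitute your pool name) should report
roughly the same bandwidth if the measurement is sound.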
single drive, zfs file
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
258.3 0.0 33066.6 0.0 33.0 2.0 127.7 7.7 100 100 c0d1
Now that is odd. Why so much waiting? Also, unlike with raw or UFS, kr/s /
r/s gives 128K, as I would imagine it should.
Not sure. If we can figure out why ZFS is slower than raw disk access
in your case, it may explain why you're seeing these results.
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
792.8 0.0 44092.9 0.0 0.0 1.8 0.0 2.2 1 98 c1d0
224.0 0.0 28675.2 0.0 33.0 2.0 147.3 8.9 100 100 c0d0
Now that is confusing! Why did SATA/ZFS slow down too? I've retried this a
number of times, not a fluke.
This could be cache interference. ZFS and UFS use different caches.

How much memory is in this box?
I have no idea what to make of all this, except that it ZFS has a problem
with this hardware/drivers that UFS and other traditional file systems,
don't. Is it a bug in the driver that ZFS is inadvertently exposing? A
specific feature that ZFS assumes the hardware to have, but it doesn't? Who
knows!
This may be a more complicated interaction than just ZFS and your
hardware. There are a number of layers of drivers underneath ZFS that
may also be interacting with your hardware in an unfavorable way.

If you'd like to do a little poking with MDB, we can see the features
that your SATA disks claim they support.

As root, type mdb -k, and then at the ">" prompt that appears, enter the
following command (this is one very long line):

*sata_hba_list::list sata_hba_inst_t satahba_next | ::print sata_hba_inst_t satahba_dev_port | ::array void* 32 | ::print void* | ::grep ".!=0" | ::print sata_cport_info_t cport_devp.cport_sata_drive | ::print -a sata_drive_info_t satadrv_features_support satadrv_settings satadrv_features_enabled

This should show satadrv_features_support, satadrv_settings, and
satadrv_features_enabled for each SATA disk on the system.

The values for these variables are defined in:

http://cvs.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/sys/sata/impl/sata.h

this is the relevant snippet for interpreting these values:

/*
* Device feature_support (satadrv_features_support)
*/
#define SATA_DEV_F_DMA 0x01
#define SATA_DEV_F_LBA28 0x02
#define SATA_DEV_F_LBA48 0x04
#define SATA_DEV_F_NCQ 0x08
#define SATA_DEV_F_SATA1 0x10
#define SATA_DEV_F_SATA2 0x20
#define SATA_DEV_F_TCQ 0x40 /* Non NCQ tagged queuing */

/*
* Device features enabled (satadrv_features_enabled)
*/
#define SATA_DEV_F_E_TAGGED_QING 0x01 /* Tagged queuing enabled */
#define SATA_DEV_F_E_UNTAGGED_QING 0x02 /* Untagged queuing enabled */

/*
* Drive settings flags (satdrv_settings)
*/
#define SATA_DEV_READ_AHEAD 0x0001 /* Read Ahead enabled */
#define SATA_DEV_WRITE_CACHE 0x0002 /* Write cache ON */
#define SATA_DEV_SERIAL_FEATURES 0x8000 /* Serial ATA feat. enabled */
#define SATA_DEV_ASYNCH_NOTIFY 0x2000 /* Asynch-event enabled */

This may give us more information if this is indeed a problem with
hardware/drivers supporting the right features.

-j
Marko Milisavljevic
2007-05-16 05:41:21 UTC
Permalink
Post by j***@sun.com
Each drive is freshly formatted with one 2G file copied to it.
How are you creating each of these files?
zpool create tank c0d0 c0d1; zfs create tank/test; cp ~/bigfile /tank/test/
Actual content of the file is random junk from /dev/random.
Post by j***@sun.com
Also, would you please include a the output from the isalist(1) command?
pentium_pro+mmx pentium_pro pentium+mmx pentium i486 i386 i86
Post by j***@sun.com
Have you double-checked that this isn't a measurement problem by
measuring zfs with zpool iostat (see zpool(1M)) and verifying that
outputs from both iostats match?
Both give same kb/s.
Post by j***@sun.com
How much memory is in this box?
1.5g, I can see in /var/adm/messages that it is recognized.
Post by j***@sun.com
As root, type mdb -k, and then at the ">" prompt that appears, enter the
*sata_hba_list::list sata_hba_inst_t satahba_next | ::print sata_hba_inst_t satahba_dev_port | ::array void* 32 | ::print void* | ::grep ".!=0" | ::print sata_cport_info_t cport_devp.cport_sata_drive | ::print -a sata_drive_info_t satadrv_features_support satadrv_settings satadrv_features_enabled
This gives me "mdb: failed to dereference symbol: unknown symbol
name". I don't know enough about the syntax here to try to isolate
which token it is complaining about. But I don't know if my PCI/SATA
card is going through the sd driver, if that is what the commands above
assume... my understanding is that sil3114 goes through the ata driver, as
per this blog: http://blogs.sun.com/mlf/entry/ata_on_solaris_x86_at

If there is any other testing I can do, I would be happy to.
j***@sun.com
2007-05-16 17:26:14 UTC
Permalink
Post by Marko Milisavljevic
Post by j***@sun.com
*sata_hba_list::list sata_hba_inst_t satahba_next | ::print
sata_hba_inst_t satahba_dev_port | ::array void* 32 | ::print void* |
::grep ".!=0" | ::print sata_cport_info_t cport_devp.cport_sata_drive |
::print -a sata_drive_info_t satadrv_features_support satadrv_settings
satadrv_features_enabled
This gives me "mdb: failed to dereference symbol: unknown symbol
name".
You may not have the SATA module installed. If you type:

::modinfo ! grep sata

and don't get any output, your sata driver is attached some other way.

My apologies for the confusion.

-K
Richard Elling
2007-05-17 14:50:26 UTC
Permalink
Queuing theory should explain this rather nicely. iostat measures
%busy by counting whether there is an entry in the queue on each
clock tick. There are two queues, one in the controller and one on the
disk. As you can clearly see, the way ZFS pushes the load is very
different from dd or UFS.
-- richard
Post by Marko Milisavljevic
I am very grateful to everyone who took the time to run a few tests to
help me figure what is going on. As per j's suggestions, I tried some
simultaneous reads, and a few other things, and I am getting interesting
and confusing results.
All tests are done using two Seagate 320G drives on sil3114. In each
test I am using dd if=.... of=/dev/null bs=128k count=10000. Each drive
is freshly formatted with one 2G file copied to it. That way dd from raw
disk and from file are using roughly same area of disk. I tried using
raw, zfs and ufs, single drives and two simultaneously (just executing
dd commands in separate terminal windows). These are snapshots of iostat
-xnczpm 3 captured somewhere in the middle of the operation. I am not
bothering to report CPU% as it never rose over 50%, and was uniformly
proportional to reported throughput.
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
1378.4 0.0 77190.7 0.0 0.0 1.7 0.0 1.2 0 98 c0d1
single drive, ufs file
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
1255.1 0.0 69949.6 0.0 0.0 1.8 0.0 1.4 0 100 c0d0
Small slowdown, but pretty good.
single drive, zfs file
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
258.3 0.0 33066.6 0.0 33.0 2.0 127.7 7.7 100 100 c0d1
Now that is odd. Why so much waiting? Also, unlike with raw or UFS, kr/s
/ r/s gives 128K, as I would imagine it should.
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
797.0 0.0 44632.0 0.0 0.0 1.8 0.0 2.3 0 100 c0d0
795.7 0.0 44557.4 0.0 0.0 1.8 0.0 2.3 0 100 c0d1
This PCI interface seems to be saturated at 90MB/s. Adequate if the goal
is to serve files on gigabit SOHO network.
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
722.4 0.0 40246.8 0.0 0.0 1.8 0.0 2.5 0 100 c0d0
717.1 0.0 40156.2 0.0 0.0 1.8 0.0 2.5 0 99 c0d1
hmm, can no longer get the 90MB/sec.
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 0.7 0.0 1.8 0.0 0.0 0.0 0.1 0 0 c1d0
334.9 0.0 18756.0 0.0 0.0 1.9 0.0 5.5 0 97 c0d0
172.5 0.0 22074.6 0.0 33.0 2.0 191.3 11.6 100 100 c0d1
Everything is slow.
What happens if we throw onboard IDE interface into the mix?
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
1036.3 0.3 58033.9 0.3 0.0 1.6 0.0 1.6 0 99 c1d0
1422.6 0.0 79668.3 0.0 0.0 1.6 0.0 1.1 1 98 c0d0
Both at maximum throughput.
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
1018.9 0.3 57056.1 4.0 0.0 1.7 0.0 1.7 0 99 c1d0
268.4 0.0 34353.1 0.0 33.0 2.0 122.9 7.5 100 100 c0d0
SATA is slower with ZFS as expected by now, but ATA remains at full
speed. So they are operating quite independantly. Except...
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
792.8 0.0 44092.9 0.0 0.0 1.8 0.0 2.2 1 98 c1d0
224.0 0.0 28675.2 0.0 33.0 2.0 147.3 8.9 100 100 c0d0
Now that is confusing! Why did SATA/ZFS slow down too? I've retried this
a number of times, not a fluke.
Finally, after reviewing all this, I've noticed another interesting
bit... whenever I read from raw disks or UFS files, SATA or PATA, kr/s
over r/s is 56k, suggesting that underlying IO system is using that as
some kind of a native block size? (even though dd is requesting 128k).
But when reading ZFS files, this always comes to 128k, which is
expected, since that is ZFS default (and same thing happens regardless
of bs= in dd). On the theory that my system just doesn't like 128k reads
(I'm desperate!), and that this would explain the whole slowdown and
wait/wsvc_t column, I tried changing recsize to 32k and rewriting the
test file. However, accessing ZFS files continues to show 128k reads,
and it is just as slow. Is there a way to either confirm that the ZFS
file in question is indeed written with 32k records or, even better, to
force ZFS to use 56k when accessing the disk. Or perhaps I just
misunderstand implications of iostat output.
I've repeated each of these tests a few times and doublechecked, and the
numbers, although snapshots of a point in time, fairly represent averages.
I have no idea what to make of all this, except that it ZFS has a
problem with this hardware/drivers that UFS and other traditional file
systems, don't. Is it a bug in the driver that ZFS is inadvertently
exposing? A specific feature that ZFS assumes the hardware to have, but
it doesn't? Who knows! I will have to give up on Solaris/ZFS on this
hardware for now, but I hope to try it again sometime in the future.
I'll give FreeBSD/ZFS a spin to see if it fares better (although at this
point in its development it is probably more risky then just sticking
with Linux and missing out on ZFS).
(Another contributor suggested turning checksumming off - it made no
difference. Same for atime. Compression was always off.)
Marko,
I tried this experiment again using 1 disk and got nearly identical
# /usr/bin/time dd if=/dev/dsk/c0t0d0 of=/dev/null bs=128k count=10000
10000+0 records in
10000+0 records out
real 21.4
user 0.0
sys 2.4
$ /usr/bin/time dd if=/test/filebench/testfile of=/dev/null bs=128k count=10000
10000+0 records in
10000+0 records out
real 21.0
user 0.0
sys 0.7
[I]t is not possible for dd to meaningfully access multiple-disk
configurations without going through the file system. I find it
curious that there is such a large slowdown by going through file
system (with single drive configuration), especially compared to UFS
or ext3.
Comparing a filesystem to raw dd access isn't a completely fair
comparison either. Few filesystems actually layout all of their data
and metadata so that every read is a completely sequential read.
I simply have a small SOHO server and I am trying to evaluate
which OS to
use to keep a redundant disk array. With unreliable
consumer-level hardware,
ZFS and the checksum feature are very interesting and the primary
selling
point compared to a Linux setup, for as long as ZFS can generate
enough
bandwidth from the drive array to saturate single gigabit ethernet.
I would take Bart's reccomendation and go with Solaris on something like a
dual-core box with 4 disks.
My hardware at the moment is the "wrong" choice for Solaris/ZFS -
PCI 3114
SATA controller on a 32-bit AthlonXP, according to many posts I
found.
http://mail.opensolaris.org/pipermail/zfs-discuss/2006-March/016874.html
<http://mail.opensolaris.org/pipermail/zfs-discuss/2006-March/016874.html>
However, since dd over raw disk is capable of extracting 75+MB/s
from this
setup, I keep feeling that surely I must be able to get at least
that much
from reading a pair of striped or mirrored ZFS drives. But I
can't - single
drive or 2-drive stripes or mirrors, I only get around 34MB/s
going through
ZFS. (I made sure mirror was rebuilt and I resilvered the stripes.)
Maybe this is a problem with your controller? What happens when you
have two simultaneous dd's to different disks running? This would
simulate the case where you're reading from the two disks at the same
time.
-j
------------------------------------------------------------------------
_______________________________________________
zfs-discuss mailing list
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Marko Milisavljevic
2007-05-14 22:16:32 UTC
Permalink
I missed an important conclusion from j's data, and that is that single disk
raw access gives him 56MB/s, and RAID 0 array gives him 961/46=21MB/s per
disk, which comes in at 38% of potential performance. That is in the
ballpark of getting 45% of potential performance, as I am seeing with my
puny setup of single or dual drives. Of course, I don't expect a complex
file system to match raw disk dd performance, but it doesn't compare
favourably to common file systems like UFS or ext3, so the question remains,
is ZFS overhead normally this big? That would mean that one needs to have at
least a 4-5 way stripe to generate enough data to saturate gigabit ethernet,
compared to a 2-3 way stripe on a "lesser" filesystem, a possibly important
consideration in a SOHO situation.
Post by j***@sun.com
This certainly isn't the case on my machine.
$ /usr/bin/time dd if=/test/filebench/largefile2 of=/dev/null bs=128k count=10000
10000+0 records in
10000+0 records out
real 1.3
user 0.0
sys 1.2
# /usr/bin/time dd if=/dev/dsk/c0t0d0 of=/dev/null bs=128k count=10000
10000+0 records in
10000+0 records out
real 22.3
user 0.0
sys 2.2
This looks like 56 MB/s on the /dev/dsk and 961 MB/s on the pool.
My pool is configured into a 46 disk RAID-0 stripe. I'm going to omit
the zpool status output for the sake of brevity.
Post by Marko Milisavljevic
What I am seeing is that ZFS performance for sequential access is
about 45% of raw disk access, while UFS (as well as ext3 on Linux) is
around 70%. For workload consisting mostly of reading large files
sequentially, it would seem then that ZFS is the wrong tool
performance-wise. But, it could be just my setup, so I would
appreciate more data points.
This isn't what we've observed in much of our performance testing.
It may be a problem with your config, although I'm not an expert on
storage configurations. Would you mind providing more details about
your controller, disks, and machine setup?
-j
Richard Elling
2007-05-14 22:52:00 UTC
Permalink
Post by Marko Milisavljevic
I missed an important conclusion from j's data, and that is that single
disk raw access gives him 56MB/s, and RAID 0 array gives him
961/46=21MB/s per disk, which comes in at 38% of potential performance.
That is in the ballpark of getting 45% of potential performance, as I am
seeing with my puny setup of single or dual drives. Of course, I don't
expect a complex file system to match raw disk dd performance, but it
doesn't compare favourably to common file systems like UFS or ext3, so
the question remains, is ZFS overhead normally this big? That would mean
that one needs to have at least 4-5 way stripe to generate enough data
to saturate gigabit ethernet, compared to 2-3 way stripe on a "lesser"
filesystem, a possibly important consideration in SOHO situation.
Could you post iostat data for these runs?

Also, as I suggested previously, try with checksum off. AthlonXP doesn't
have a reputation as a speed demon.

BTW, for 7,200 rpm drives, which are typical in desktops, 56 MBytes/s
isn't bad. The media speed will range from perhaps [30-40]-[60-75] MBytes/s
judging from a quick scan of disk vendor datasheets. In other words, it
would not surprise me to see 4-5 way stripe being required to keep a
GbE saturated.
-- richard
Marko Milisavljevic
2007-05-14 23:27:39 UTC
Permalink
Right now, the AthlonXP machine is booted into Linux, and I'm getting the same
raw speed as when it is running Solaris, from the PCI Sil3114 with a Seagate 320G (
7200.10):

dd if=/dev/sdb of=/dev/null bs=128k count=10000
10000+0 records in
10000+0 records out
1310720000 bytes (1.3 GB) copied, 16.7756 seconds, 78.1 MB/s

sudo dd if=./test.mov of=/dev/null bs=128k count=10000
10000+0 records in
10000+0 records out
1310720000 bytes (1.3 GB) copied, 24.2731 seconds, 54.0 MB/s <-- some
overhead compared to raw speed of same disk above

same machine, onboard ATA, Seagate 120G:
dd if=/dev/hda of=/dev/null bs=128k count=10000
10000+0 records in
10000+0 records out
1310720000 bytes (1.3 GB) copied, 22.5892 seconds, 58.0 MB/s

On another machine with Pentium D 3.0GHz and ICH7 onboard SATA in AHCI mode,
running Darwin OS:

from a Seagate 500G (7200.10):
dd if=/dev/rdisk0 of=/dev/null bs=128k count=10000
10000+0 records in
10000+0 records out
1310720000 bytes transferred in 17.697512 secs (74062388 bytes/sec)

same disk, access through file system (HFS+)
dd if=./Summer\ 2006\ with\ Cohen\ 4 of=/dev/null bs=128k count=10000
10000+0 records in
10000+0 records out
1310720000 bytes transferred in 20.381901 secs (64308035 bytes/sec) <- very
small overhead compared to raw access above!

same Intel machine, Seagate 200G (7200.8, I think):
dd if=/dev/rdisk1 of=/dev/null bs=128k count=10000
10000+0 records in
10000+0 records out
1310720000 bytes transferred in 20.850229 secs (62863578 bytes/sec)

Modern disk drives are definitely fast and pushing close to 80MB/s raw
performance. And some file systems can get over 85% of that with simple
sequential access. So far, on these particular hardware and software
combinations, I have the following filesystem performance as a percentage of
raw disk performance for sequential uncached reads:

HFS+: 86%
ext3 and UFS: 70%
ZFS: 45%
Post by Richard Elling
Post by Marko Milisavljevic
I missed an important conclusion from j's data, and that is that single
disk raw access gives him 56MB/s, and RAID 0 array gives him
961/46=21MB/s per disk, which comes in at 38% of potential performance.
That is in the ballpark of getting 45% of potential performance, as I am
seeing with my puny setup of single or dual drives. Of course, I don't
expect a complex file system to match raw disk dd performance, but it
doesn't compare favourably to common file systems like UFS or ext3, so
the question remains, is ZFS overhead normally this big? That would mean
that one needs to have at least 4-5 way stripe to generate enough data
to saturate gigabit ethernet, compared to 2-3 way stripe on a "lesser"
filesystem, a possibly important consideration in SOHO situation.
Could you post iostat data for these runs?
Also, as I suggested previously, try with checksum off. AthlonXP doesn't
have a reputation as a speed deamon.
BTW, for 7,200 rpm drives, which are typical in desktops, 56 MBytes/s
isn't bad. The media speed will range from perhaps [30-40]-[60-75] MBytes/s
judging from a quick scan of disk vendor datasheets. In other words, it
would not surprise me to see 4-5 way stripe being required to keep a
GbE saturated.
-- richard
Bart Smaalders
2007-05-14 22:59:07 UTC
Permalink
Post by Marko Milisavljevic
I missed an important conclusion from j's data, and that is that single
disk raw access gives him 56MB/s, and RAID 0 array gives him
961/46=21MB/s per disk, which comes in at 38% of potential performance.
That is in the ballpark of getting 45% of potential performance, as I am
seeing with my puny setup of single or dual drives. Of course, I don't
expect a complex file system to match raw disk dd performance, but it
doesn't compare favourably to common file systems like UFS or ext3, so
the question remains, is ZFS overhead normally this big? That would mean
that one needs to have at least 4-5 way stripe to generate enough data
to saturate gigabit ethernet, compared to 2-3 way stripe on a "lesser"
filesystem, a possibly important consideration in SOHO situation.
I don't see this on my system, but it has more CPU (dual
core 2.6 GHz). It saturates a GB net w/ 4 drives & samba,
not working hard at all. A thumper does 2 GB/sec w 2 dual
core CPUs.

Do you have compression enabled? This can be a choke point
for weak CPUs.
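You can check with something like (pool name is just an example):

zfs get compression tank

It should report "off" unless it was explicitly enabled.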

- Bart


Bart Smaalders Solaris Kernel Performance
***@cyber.eng.sun.com http://blogs.sun.com/barts
Al Hopper
2007-05-14 22:44:48 UTC
Permalink
Post by Marko Milisavljevic
To reply to my own message.... this article offers lots of insight into why dd access directly through raw disk is fast, while accessing a file through the file system may be slow.
http://www.informit.com/articles/printerfriendly.asp?p=606585&rl=1
So, I guess what I'm wondering now is, does it happen to everyone that ZFS is under half the speed of raw disk access? What speeds are other people getting trying to dd a file through zfs file system? Something like
dd if=/pool/mount/file of=/dev/null bs=128k (assuming you are using default ZFS block size)
dd if=/dev/dsk/diskinzpool of=/dev/null bs=128k count=10000
If you could please post your MB/s and show output of zpool status so we
can see your disk configuration I would appreciate it. Please use file
that is 100MB or more - result is be too random with small files. Also
make sure zfs is not caching the file already!
# ptime dd if=./allhomeal20061209_01.tar of=/dev/null bs=128k count=10000
10000+0 records in
10000+0 records out

real 6.407
user 0.008
sys 1.624

pool: tank
state: ONLINE
scrub: none requested
config:

NAME STATE READ WRITE CKSUM
tank ONLINE 0 0 0
raidz1 ONLINE 0 0 0
c2t0d0 ONLINE 0 0 0
c2t1d0 ONLINE 0 0 0
c2t2d0 ONLINE 0 0 0
c2t3d0 ONLINE 0 0 0
c2t4d0 ONLINE 0 0 0

3-way mirror:

10000+0 records in
10000+0 records out

real 12.500
user 0.007
sys 1.216

2-way mirror:

10000+0 records in
10000+0 records out

real 18.356
user 0.006
sys 0.935


# psrinfo -v
Status of virtual processor 0 as of: 05/14/2007 17:31:18
on-line since 05/03/2007 08:01:21.
The i386 processor operates at 2009 MHz,
and has an i387 compatible floating point processor.
Status of virtual processor 1 as of: 05/14/2007 17:31:18
on-line since 05/03/2007 08:01:24.
The i386 processor operates at 2009 MHz,
and has an i387 compatible floating point processor.
Status of virtual processor 2 as of: 05/14/2007 17:31:18
on-line since 05/03/2007 08:01:26.
The i386 processor operates at 2009 MHz,
and has an i387 compatible floating point processor.
Status of virtual processor 3 as of: 05/14/2007 17:31:18
on-line since 05/03/2007 08:01:28.
The i386 processor operates at 2009 MHz,
and has an i387 compatible floating point processor.
Post by Marko Milisavljevic
What I am seeing is that ZFS performance for sequential access is about 45% of raw disk access, while UFS (as well as ext3 on Linux) is around 70%. For workload consisting mostly of reading large files sequentially, it would seem then that ZFS is the wrong tool performance-wise. But, it could be just my setup, so I would appreciate more data points.
Regards,

Al Hopper Logical Approach Inc, Plano, TX. ***@logical-approach.com
Voice: 972.379.2133 Fax: 972.379.2134 Timezone: US CDT
OpenSolaris Governing Board (OGB) Member - Apr 2005 to Mar 2007
http://www.opensolaris.org/os/community/ogb/ogb_2005-2007/
Marko Milisavljevic
2007-05-14 22:53:01 UTC
Permalink
Thank you, Al.

Would you mind also doing:

ptime dd if=/dev/dsk/c2t1d0 of=/dev/null bs=128k count=10000

to see the raw performance of underlying hardware.
Post by Al Hopper
# ptime dd if=./allhomeal20061209_01.tar of=/dev/null bs=128k count=10000
10000+0 records in
10000+0 records out
real 6.407
user 0.008
sys 1.624
pool: tank
state: ONLINE
scrub: none requested
NAME STATE READ WRITE CKSUM
tank ONLINE 0 0 0
raidz1 ONLINE 0 0 0
c2t0d0 ONLINE 0 0 0
c2t1d0 ONLINE 0 0 0
c2t2d0 ONLINE 0 0 0
c2t3d0 ONLINE 0 0 0
c2t4d0 ONLINE 0 0 0
10000+0 records in
10000+0 records out
real 12.500
user 0.007
sys 1.216
10000+0 records in
10000+0 records out
real 18.356
user 0.006
sys 0.935
Al Hopper
2007-05-15 03:16:08 UTC
Permalink
Post by Marko Milisavljevic
Thank you, Al.
ptime dd if=/dev/dsk/c2t1d0 of=/dev/null bs=128k count=10000
# ptime dd if=/dev/dsk/c2t1d0 of=/dev/null bs=128k count=10000

real 20.046
user 0.013
sys 3.568
Post by Marko Milisavljevic
to see the raw performance of underlying hardware.
Regards,

Al Hopper
Jürgen Keil
2007-05-15 17:13:53 UTC
Permalink
Post by Marko Milisavljevic
ptime dd if=/dev/dsk/c2t1d0 of=/dev/null bs=128k count=10000
to see the raw performance of underlying hardware.
This dd command is reading from the block device,
which might cache data and probably splits requests
into "maxphys" pieces (which happens to be 56K on an
x86 box).

I'd read from the raw device, /dev/rdsk/c2t1d0 ...


This message posted from opensolaris.org
Jonathan Edwards
2007-05-15 17:35:35 UTC
Permalink
Post by Jürgen Keil
ptime dd if=/dev/dsk/c2t1d0 of=/dev/null bs=128k count=10000
Post by Jürgen Keil
to see the raw performance of underlying hardware.
This dd command is reading from the block device,
which might cache dataand probably splits requests
into "maxphys" pieces (which happens to be 56K on an
x86 box).
to increase this to say 8MB, add the following to /etc/system:

set maxphys=0x800000

and you'll probably want to increase sd_max_xfer_size as
well (should be 256K on x86/x64) .. add the following to
/kernel/drv/sd.conf:

sd_max_xfer_size=0x800000;

then reboot to get the kernel and sd tunings to take.
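After the reboot you can sanity-check that the new value took with something like

echo maxphys/D | mdb -k

which just prints the kernel's maxphys variable in decimal.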

---
.je

btw - the defaults on sparc:
maxphys = 128K
ssd_max_xfer_size = maxphys
sd_max_xfer_size = maxphys
Marko Milisavljevic
2007-05-16 05:14:40 UTC
Permalink
I tried as you suggested, but I notice that the output from iostat while
doing dd if=/dev/dsk/... still shows that reading is done in 56k
chunks. I haven't seen any change in performance. Perhaps iostat
doesn't say what I think it does. Using dd if=/dev/rdsk/.. gives 256k,
and dd if=zfsfile gives 128k read sizes.
Post by Jürgen Keil
Post by Marko Milisavljevic
ptime dd if=/dev/dsk/c2t1d0 of=/dev/null bs=128k count=10000
to see the raw performance of underlying hardware.
This dd command is reading from the block device,
which might cache dataand probably splits requests
into "maxphys" pieces (which happens to be 56K on an
x86 box).
set maxphys=0x800000
and you'll probably want to increase sd_max_xfer_size as
well (should be 256K on x86/x64) .. add the following to
sd_max_xfer_size=0x800000;
then reboot to get the kernel and sd tunings to take.
---
.je
maxphys = 128K
ssd_max_xfer_size = maxphys
sd_max_xfer_size = maxphys
_______________________________________________
zfs-discuss mailing list
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Torrey McMahon
2007-05-19 20:49:42 UTC
Permalink
Post by Jürgen Keil
Post by Marko Milisavljevic
ptime dd if=/dev/dsk/c2t1d0 of=/dev/null bs=128k count=10000
to see the raw performance of underlying hardware.
This dd command is reading from the block device,
which might cache dataand probably splits requests
into "maxphys" pieces (which happens to be 56K on an
x86 box).
set maxphys=0x800000
and you'll probably want to increase sd_max_xfer_size as
well (should be 256K on x86/x64) .. add the following to
sd_max_xfer_size=0x800000;
then reboot to get the kernel and sd tunings to take.
---
.je
maxphys = 128K
ssd_max_xfer_size = maxphys
sd_max_xfer_size = maxphys
Maybe we should file a bug to increase the max transfer request sizes?
Ian Collins
2007-05-14 23:15:15 UTC
Permalink
Post by Marko Milisavljevic
To reply to my own message.... this article offers lots of insight into why dd access directly through raw disk is fast, while accessing a file through the file system may be slow.
http://www.informit.com/articles/printerfriendly.asp?p=606585&rl=1
So, I guess what I'm wondering now is, does it happen to everyone that ZFS is under half the speed of raw disk access? What speeds are other people getting trying to dd a file through zfs file system? Something like
dd if=/pool/mount/file of=/dev/null bs=128k (assuming you are using default ZFS block size)
dd if=/dev/dsk/diskinzpool of=/dev/null bs=128k count=10000
Testing on a old Athlon MP box, two U160 10K SCSI drives.

bash-3.00# time dd if=/dev/dsk/c2t0d0 of=/dev/null bs=128k count=10000
10000+0 records in
10000+0 records out

real 0m44.470s
user 0m0.018s
sys 0m8.290s

time dd if=/test/play/sol-nv-b62-x86-dvd.iso of=/dev/null bs=128k
count=10000
10000+0 records in
10000+0 records out

real 0m22.714s
user 0m0.020s
sys 0m3.228s

zpool status
pool: test
state: ONLINE
scrub: none requested
config:

NAME STATE READ WRITE CKSUM
test ONLINE 0 0 0
mirror ONLINE 0 0 0
c2t0d0 ONLINE 0 0 0
c2t1d0 ONLINE 0 0 0

Ian
Marko Milisavljevic
2007-05-14 23:39:30 UTC
Permalink
Thank you, Ian,

You are getting ZFS over a 2-disk mirror to be twice as fast as a dd raw disk
read on one disk, which sounds more encouraging. But, there is something odd
with dd from the raw drive - it is only 28MB/s or so, if I divided that right? I
would expect it to be around 100MB/s on 10K drives, or at least that should be
roughly the potential throughput rate. Compared to the throughput from the ZFS
2-disk mirror, which is showing 57MB/s. Any idea why the raw dd read is so slow?

Also, I wonder if everyone is using a different dd command than I am - I get
a summary line that shows elapsed time and MB/s.
Post by Marko Milisavljevic
Post by Marko Milisavljevic
To reply to my own message.... this article offers lots of insight into
why dd access directly through raw disk is fast, while accessing a file
through the file system may be slow.
Post by Marko Milisavljevic
http://www.informit.com/articles/printerfriendly.asp?p=606585&rl=1
So, I guess what I'm wondering now is, does it happen to everyone that
ZFS is under half the speed of raw disk access? What speeds are other people
getting trying to dd a file through zfs file system? Something like
Post by Marko Milisavljevic
dd if=/pool/mount/file of=/dev/null bs=128k (assuming you are using
default ZFS block size)
Post by Marko Milisavljevic
dd if=/dev/dsk/diskinzpool of=/dev/null bs=128k count=10000
Testing on a old Athlon MP box, two U160 10K SCSI drives.
bash-3.00# time dd if=/dev/dsk/c2t0d0 of=/dev/null bs=128k count=10000
10000+0 records in
10000+0 records out
real 0m44.470s
user 0m0.018s
sys 0m8.290s
time dd if=/test/play/sol-nv-b62-x86-dvd.iso of=/dev/null bs=128k
count=10000
10000+0 records in
10000+0 records out
real 0m22.714s
user 0m0.020s
sys 0m3.228s
zpool status
pool: test
state: ONLINE
scrub: none requested
NAME STATE READ WRITE CKSUM
test ONLINE 0 0 0
mirror ONLINE 0 0 0
c2t0d0 ONLINE 0 0 0
c2t1d0 ONLINE 0 0 0
Ian
Nick G
2007-05-15 01:19:27 UTC
Permalink
Don't know how much this will help, but my results:

Ultra 20 we just got at work:

# uname -a
SunOS unknown 5.10 Generic_118855-15 i86pc i386 i86pc

raw disk
dd if=/dev/dsk/c1d0s6 of=/dev/null bs=128k count=10000 0.00s user 2.16s system 14% cpu 15.131 total

1,280,000k in 15.131 seconds
84768k/s

through filesystem
dd if=testfile of=/dev/null bs=128k count=10000 0.01s user 0.88s system 4% cpu 19.666 total

1,280,000k in 19.666 seconds
65087k/s


AMD64 FreeBSD 7 on a Lenovo something or other, Athlon X2 3800+

uname -a
FreeBSD 7.0-CURRENT-200705 FreeBSD 7.0-CURRENT-200705 #0: Fri May 11 14:41:37 UTC 2007 root@:/usr/src/sys/amd64/compile/ZFS amd64

raw disk
dd if=/dev/ad6p1 of=/dev/null bs=128k count=10000
10000+0 records in
10000+0 records out
1310720000 bytes transferred in 17.126926 secs (76529787 bytes/sec)
(74735k/s)

filesystem
# dd of=/dev/null if=testfile bs=128k count=10000
10000+0 records in
10000+0 records out
1310720000 bytes transferred in 17.174395 secs (76318263 bytes/sec)
(74529k/s)

Odd to say the least since "du" for instance is faster on Solaris ZFS...

FWIW FreeBSD is running version 6 of ZFS and the unpatched but _new_ Ultra 20 is running version 2 of ZFS according to zdb


Make sure you're all patched up?


This message posted from opensolaris.org
Matthew Ahrens
2007-05-16 02:10:04 UTC
Permalink
Post by Marko Milisavljevic
I was trying to simply test bandwidth that Solaris/ZFS (Nevada b63) can
deliver from a drive, and doing this: dd if=(raw disk) of=/dev/null gives
me around 80MB/s, while dd if=(file on ZFS) of=/dev/null gives me only
35MB/s!?.
Our experience is that ZFS gets very close to raw performance for streaming
reads (assuming that there is adequate CPU and memory available).

When doing reads, prefetching (and thus caching) is a critical component of
performance. It may be that ZFS's prefetching or caching is misbehaving somehow.

Your machine is 32-bit, right? This could be causing some caching pain...
How much memory do you have? While you're running the test on ZFS, can you
send the output of:

echo ::memstat | mdb -k
echo ::arc | mdb -k

Next, try running your test with prefetch disabled, by putting
set zfs:zfs_prefetch_disable=1
in /etc/system and rebooting before running your test. Send the 'iostat
-xnpcz' output while this test is running.
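To double-check the setting after the reboot, something along these lines
should print the tunable's value (the symbol lives in the zfs module, which
will be loaded once your pool is imported):

echo zfs_prefetch_disable/D | mdb -k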

Finally, on a modern drive the streaming performance can vary by up to 2x when
reading the outside vs. the inside of the disk. If your pool had been used
before you created your test file, it could be laid out on the inside part of
the disk. Then you would be comparing raw reads of the outside of the disk
vs. zfs reads of the inside of the disk. When the pool is empty, ZFS will
start allocating from the outside, so you can try destroying and recreating
your pool and creating the file on the fresh pool. Alternatively, create a
small partition (say, 10% of the disk size) and do your tests on that to
ensure that the file is not far from where your raw reads are going.

Let us know how that goes.

--matt
Marko Milisavljevic
2007-05-16 05:09:05 UTC
Permalink
Hello Matthew,

Yes, my machine is 32-bit, with 1.5G of RAM.

-bash-3.00# echo ::memstat | mdb -k
Page Summary Pages MB %Tot
------------ ---------------- ---------------- ----
Kernel 123249 481 32%
Anon 33704 131 9%
Exec and libs 7637 29 2%
Page cache 1116 4 0%
Free (cachelist) 222661 869 57%
Free (freelist) 2685 10 1%

Total 391052 1527
Physical 391051 1527

-bash-3.00# echo ::arc | mdb -k
{
anon = -759566176
mru = -759566136
mru_ghost = -759566096
mfu = -759566056
mfu_ghost = -759566016
size = 0x17f20c00
p = 0x160ef900
c = 0x17f16ae0
c_min = 0x4000000
c_max = 0x1da00000
hits = 0x353b
misses = 0x264b
deleted = 0x13bc
recycle_miss = 0x31
mutex_miss = 0
evict_skip = 0
hash_elements = 0x127b
hash_elements_max = 0x1a19
hash_collisions = 0x61
hash_chains = 0x4c
hash_chain_max = 0x1
no_grow = 1
}

Now let's try:
set zfs:zfs_prefetch_disable=1

bingo!

r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
609.0 0.0 77910.0 0.0 0.0 0.8 0.0 1.4 0 83 c0d0

only 1-2% slower than dd from /dev/dsk. Do you think this is a general
32-bit problem, or specific to this combination of hardware? I am
using a PCI/SATA Sil3114 card, and other than ZFS, performance of this
interface has some limitations in Solaris. That is, a single drive gives
80MB/s, but doing dd from /dev/dsk/xyz simultaneously on 2 drives attached
to the card gives only 46MB/s each. On Linux, however, that gives
60MB/s each, close to saturating the theoretical throughput of the PCI bus.
Having both drives in a zpool stripe gives, with prefetch disabled,
close to 45MB/s each through dd from a ZFS file. I think that under
Solaris, this card is accessed through the ATA driver.
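
(For reference, the simultaneous raw-disk test was just two dd's running in
parallel, something like the following; the p0 whole-disk device nodes are
just an example of how to address the raw devices:

# dd if=/dev/dsk/c0d0p0 of=/dev/null bs=128k count=10000 &
# dd if=/dev/dsk/c0d1p0 of=/dev/null bs=128k count=10000 &
# wait
)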

There shouldn't be any issues with inside vs. outside: all the reading is
done on the first gig or two of the drive, as there is nothing else on
them except one 2-gig file. (Well, I'm assuming a simple copy onto a
newly formatted ZFS drive puts it at the start of the drive.) The drives are
completely owned by ZFS, created with zpool create c0d0 c0d1.

Finally, should I file a bug somewhere regarding prefetch, or is this
a known issue?

Many thanks.
Post by Matthew Ahrens
Post by Marko Milisavljevic
I was trying to simply test bandwidth that Solaris/ZFS (Nevada b63) can
deliver from a drive, and doing this: dd if=(raw disk) of=/dev/null gives
me around 80MB/s, while dd if=(file on ZFS) of=/dev/null gives me only
35MB/s!?.
Our experience is that ZFS gets very close to raw performance for streaming
reads (assuming that there is adequate CPU and memory available).
When doing reads, prefetching (and thus caching) is a critical component of
performance. It may be that ZFS's prefetching or caching is misbehaving somehow.
Your machine is 32-bit, right? This could be causing some caching pain...
How much memory do you have? While you're running the test on ZFS, can you
echo ::memstat | mdb -k
echo ::arc | mdb -k
Next, try running your test with prefetch disabled, by putting
set zfs:zfs_prefetch_disable=1
in /etc/system and rebooting before running your test. Send the 'iostat
-xnpcz' output while this test is running.
Finally, on modern drive the streaming performance can vary by up to 2x when
reading the outside vs. the inside of the disk. If your pool had been used
before you created your test file, it could be laid out on the inside part of
the disk. Then you would be comparing raw reads of the outside of the disk
vs. zfs reads of the inside of the disk. When the pool is empty, ZFS will
start allocating from the outside, so you can try destroying and recreating
your pool and creating the file on the fresh pool. Alternatively, create a
small partition (say, 10% of the disk size) and do your tests on that to
ensure that the file is not far from where your raw reads are going.
Let us know how that goes.
--matt
Marko Milisavljevic
2007-05-16 09:47:46 UTC
Permalink
Got excited too quickly on one thing... reading a single ZFS file does give me
almost the same speed as dd from /dev/dsk... around 78MB/s... however, a
2-drive stripe still doesn't perform as well as it ought to:

r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
294.3 0.0 37675.6 0.0 0.0 0.4 0.0 1.4 0 40 c3d0
293.0 0.0 37504.9 0.0 0.0 0.4 0.0 1.4 0 40 c3d1

Simultaneous dd on those 2 drives from /dev/dsk runs at 46MB/s per drive.
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
800.4 0.0 44824.6 0.0 0.0 1.8 0.0 2.2 0 99 c3d0
792.1 0.0 44357.9 0.0 0.0 1.8 0.0 2.2 0 98 c3d1

(and in Linux it saturates the PCI bus at 60MB/s per drive)
Post by Matthew Ahrens
set zfs:zfs_prefetch_disable=1
bingo!
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
609.0 0.0 77910.0 0.0 0.0 0.8 0.0 1.4 0 83 c0d0
only 1-2 % slower then dd from /dev/dsk. Do you think this is general
32-bit problem, or specific to this combination of hardware? I am
using PCI/SATA Sil3114 card, and other then ZFS, performance of this
interface has some limitations in Solaris. That is, single drive gives
80MB/s, but doing dd /dev/dsk/xyz simultaneously on 2 drives attached
to the card gives only 46MB/s each. On Linux, however, that gives
60MB/s each, close to saturating theoretical throughput of PCI bus.
Having both drives in zpool stripe gives, with prefetch disabled,
close to 45MB/s each through dd from zfs file.
Matthew Ahrens
2007-05-16 16:29:30 UTC
Permalink
Post by Marko Milisavljevic
Got excited too quickly on one thing... reading single zfs file does
give me almost same speed as dd /dev/dsk... around 78MB/s... however,
Yes, that makes sense. Because prefetch is disabled, ZFS will only
issue one read i/o at a time (for that stream). This is one of the
reasons prefetch is important :-)

Eg, in your output below you can see that each disk is only busy 40% of
the time:
Post by Marko Milisavljevic
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
294.3 0.0 37675.6 0.0 0.0 0.4 0.0 1.4 0 40 c3d0
293.0 0.0 37504.9 0.0 0.0 0.4 0.0 1.4 0 40 c3d1
Simultaneous dd on those 2 drives from /dev/dsk runs at 46MB/s per drive.
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
800.4 0.0 44824.6 0.0 0.0 1.8 0.0 2.2 0 99 c3d0
792.1 0.0 44357.9 0.0 0.0 1.8 0.0 2.2 0 98 c3d1
--matt
Matthew Ahrens
2007-05-16 16:32:35 UTC
Permalink
Post by Matthew Ahrens
set zfs:zfs_prefetch_disable=1
bingo!
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
609.0 0.0 77910.0 0.0 0.0 0.8 0.0 1.4 0 83 c0d0
only 1-2 % slower then dd from /dev/dsk. Do you think this is general
32-bit problem, or specific to this combination of hardware?
I suspect that it's fairly generic, but more analysis will be necessary.
Post by Matthew Ahrens
Finally, should I file a bug somewhere regarding prefetch, or is this
a known issue?
It may be related to 6469558, but yes please do file another bug report.
I'll have someone on the ZFS team take a look at it.

--matt
Marko Milisavljevic
2007-05-16 17:06:33 UTC
Permalink
I will do that, but I'll do a couple of things first, to try to isolate the
problem more precisely:

- Use ZFS on a plain PATA drive on the onboard IDE connector to see if it works
with prefetch enabled on this 32-bit machine.
- Use this PCI-SATA card in a 64-bit machine with 2 GB of RAM and see how it
performs there, and also compare it to that machine's onboard ICH7 SATA
interface (I assume I can force it to use the AHCI driver or not by changing
the mode of operation for ICH7 in the BIOS; one quick way to check which
driver actually attaches is sketched below).
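
One quick way to check which driver actually attaches to the controller is
something like this (the grep pattern is just an example):

# prtconf -D | grep -i -e ata -e ahci -e sata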

Marko
Post by Matthew Ahrens
Post by Marko Milisavljevic
Finally, should I file a bug somewhere regarding prefetch, or is this
a known issue?
It may be related to 6469558, but yes please do file another bug report.
I'll have someone on the ZFS team take a look at it.
--matt
j***@sun.com
2007-05-16 18:38:24 UTC
Permalink
At Matt's request, I did some further experiments and have found that
this appears to be particular to your hardware. This is not a general
32-bit problem. I re-ran this experiment on a 1-disk pool using 32-
and 64-bit kernels. I got identical results:

64-bit
======

$ /usr/bin/time dd if=/testpool1/filebench/testfile of=/dev/null bs=128k
count=10000
10000+0 records in
10000+0 records out

real 20.1
user 0.0
sys 1.2

62 MB/s

# /usr/bin/time dd if=/dev/dsk/c1t3d0 of=/dev/null bs=128k count=10000
10000+0 records in
10000+0 records out

real 19.0
user 0.0
sys 2.6

65 MB/s

32-bit
======

/usr/bin/time dd if=/testpool1/filebench/testfile of=/dev/null bs=128k
count=10000
10000+0 records in
10000+0 records out

real 20.1
user 0.0
sys 1.7

62 MB/s

# /usr/bin/time dd if=/dev/dsk/c1t3d0 of=/dev/null bs=128k count=10000
10000+0 records in
10000+0 records out

real 19.1
user 0.0
sys 4.3

65 MB/s

-j
Post by Matthew Ahrens
Post by Matthew Ahrens
set zfs:zfs_prefetch_disable=1
bingo!
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
609.0 0.0 77910.0 0.0 0.0 0.8 0.0 1.4 0 83 c0d0
only 1-2 % slower then dd from /dev/dsk. Do you think this is general
32-bit problem, or specific to this combination of hardware?
I suspect that it's fairly generic, but more analysis will be necessary.
Post by Matthew Ahrens
Finally, should I file a bug somewhere regarding prefetch, or is this
a known issue?
It may be related to 6469558, but yes please do file another bug report.
I'll have someone on the ZFS team take a look at it.
--matt
_______________________________________________
zfs-discuss mailing list
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
j***@sun.com
2007-05-16 20:18:05 UTC
Permalink
Marko,
Matt and I discussed this offline some more and he had a couple of ideas
about double-checking your hardware.

It looks like your controller (or disks, maybe?) is having trouble with
multiple simultaneous I/Os to the same disk, and prefetch seems to
aggravate the problem.

When I asked Matt what we could do to verify that it's the number of
concurrent I/Os that is causing performance to be poor, he had the
following suggestions:

set zfs_vdev_{min,max}_pending=1 and run with prefetch on, then
iostat should show 1 outstanding io and perf should be good.

or turn prefetch off, and have multiple threads reading
concurrently, then iostat should show multiple outstanding ios
and perf should be bad.
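
For the second check, something as simple as two dd's against two different
files in the pool should do (file names here are placeholders):

# dd if=/tank/file1 of=/dev/null bs=128k &
# dd if=/tank/file2 of=/dev/null bs=128k &
# wait

and watch the actv (outstanding I/O) column in 'iostat -xnpcz'.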

Let me know if you have any additional questions.

-j
Post by j***@sun.com
At Matt's request, I did some further experiments and have found that
this appears to be particular to your hardware. This is not a general
32-bit problem. I re-ran this experiment on a 1-disk pool using a 32
64-bit
======
$ /usr/bin/time dd if=/testpool1/filebench/testfile of=/dev/null bs=128k
count=10000
10000+0 records in
10000+0 records out
real 20.1
user 0.0
sys 1.2
62 Mb/s
# /usr/bin/time dd if=/dev/dsk/c1t3d0 of=/dev/null bs=128k count=10000
10000+0 records in
10000+0 records out
real 19.0
user 0.0
sys 2.6
65 Mb/s
32-bit
======
/usr/bin/time dd if=/testpool1/filebench/testfile of=/dev/null bs=128k
count=10000
10000+0 records in
10000+0 records out
real 20.1
user 0.0
sys 1.7
62 Mb/s
# /usr/bin/time dd if=/dev/dsk/c1t3d0 of=/dev/null bs=128k count=10000
10000+0 records in
10000+0 records out
real 19.1
user 0.0
sys 4.3
65 Mb/s
-j
Post by Matthew Ahrens
Post by Matthew Ahrens
set zfs:zfs_prefetch_disable=1
bingo!
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
609.0 0.0 77910.0 0.0 0.0 0.8 0.0 1.4 0 83 c0d0
only 1-2 % slower then dd from /dev/dsk. Do you think this is general
32-bit problem, or specific to this combination of hardware?
I suspect that it's fairly generic, but more analysis will be necessary.
Post by Matthew Ahrens
Finally, should I file a bug somewhere regarding prefetch, or is this
a known issue?
It may be related to 6469558, but yes please do file another bug report.
I'll have someone on the ZFS team take a look at it.
--matt
_______________________________________________
zfs-discuss mailing list
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Marko Milisavljevic
2007-05-17 06:58:59 UTC
Permalink
Thank you, following your suggestion improves things - reading a ZFS
file from a RAID-0 pair now gives me 95MB/s - about the same as from
/dev/dsk. What I find surprising is that reading from a RAID-1 2-drive
zpool gives me only 56MB/s - I imagined it would be roughly like
reading from RAID-0. I can see that it can't be identical - when
reading mirrored drives simultaneously, some data will need to be
skipped if the file is laid out sequentially - but it doesn't seem
intuitively obvious how my broken drivers/card would affect it to that
degree, especially since reading a file from a one-disk zpool gives
me 70MB/s. My plan is to make a 4-disk RAID-Z - we'll see how it works
out when all the drives arrive.
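
(For reference, the layouts compared above were created along these lines;
the pool name and the device names for the two drives that haven't arrived
yet are just examples:

2-drive stripe:  # zpool create tank c3d0 c3d1
2-drive mirror:  # zpool create tank mirror c3d0 c3d1
4-disk RAID-Z:   # zpool create tank raidz c3d0 c3d1 c4d0 c4d1
)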

Given how common the Sil3114 chipset is in the
my-old-computer-became-home-server segment, I am sure this workaround
will be appreciated by many who google their way here. And just in
case it is not clear, what j means below is to add these two lines to
/etc/system:

set zfs:zfs_vdev_min_pending=1
set zfs:zfs_vdev_max_pending=1

I've been doing a lot of reading, and it seems unlikely that any effort
will be made to address the driver performance of either the ATA driver or
the Sil311x chipset specifically - by the time more pressing enhancements
are made to the various SATA drivers, this hardware will be too obsolete to
matter.

With your workaround things are working well enough for my purposes
that I am able to choose Solaris over Linux - thanks again.

Marko
Post by j***@sun.com
Marko,
Matt and I discussed this offline some more and he had a couple of ideas
about double-checking your hardware.
It looks like your controller (or disks, maybe?) is having trouble with
multiple simultaneous I/Os to the same disk. It looks like prefetch
aggravates this problem.
When I asked Matt what we could do to verify that it's the number of
concurrent I/Os that is causing performance to be poor, he had the
set zfs_vdev_{min,max}_pending=1 and run with prefetch on, then
iostat should show 1 outstanding io and perf should be good.
or turn prefetch off, and have multiple threads reading
concurrently, then iostat should show multiple outstanding ios
and perf should be bad.
Let me know if you have any additional questions.
-j
Trygve Laugstøl
2007-05-20 10:48:54 UTC
Permalink
Post by Marko Milisavljevic
Thank you, following your suggestion improves things - reading a ZFS
file from a RAID-0 pair now gives me 95MB/sec - about the same as from
/dev/dsk. What I find surprising is that reading from RAID-1 2-drive
zpool gives me only 56MB/s - I imagined it would be roughly like
reading from RAID-0. I can see that it can't be identical - when
reading mirrored drives simultaneously, some data will need to be
skipped if the file is laid out sequentially, but it doesn't seem
intuitively obvious how my broken drvers/card would affect it to that
degree, especially since reading from a file from one-disk zpool gives
me 70MB/s. My plan was to make 4-disk RAID-Z - we'll see how it works
out when all drives arrive.
Given how common Sil3114 chipset is in
my-old-computer-became-home-server segment, I am sure this workaround
will be appreciated by many who google their way here. And just in
case it is not clear, what j means below is to add these two lines in
set zfs:zfs_vdev_min_pending=1
set zfs:zfs_vdev_max_pending=1
I just tried the same myself but got these warnings when booting:

May 20 01:22:29 deservio genunix: [ID 492708 kern.notice] sorry,
variable 'zfs_vdev_min_pending' is not defined in the 'zfs'
May 20 01:22:29 deservio genunix: [ID 966847 kern.notice] module
May 20 01:22:29 deservio genunix: [ID 100000 kern.notice]
May 20 01:22:29 deservio genunix: [ID 492708 kern.notice] sorry,
variable 'zfs_vdev_max_pending' is not defined in the 'zfs'
May 20 01:22:29 deservio genunix: [ID 966847 kern.notice] module
May 20 01:22:29 deservio genunix: [ID 100000 kern.notice]

I'm running b60.
Marko Milisavljevic
2007-05-21 05:42:19 UTC
Permalink
It is definitely defined in b63... not sure when it got introduced.

http://src.opensolaris.org/source/xref/onnv/aside/usr/src/cmd/mdb/common/modules/zfs/zfs.c

shows tunable parameters for ZFS, under "zfs_params(...)"
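
On builds that have them, you can also read the current values straight out
of the running kernel, e.g.:

# echo ::zfs_params | mdb -k
# echo zfs_vdev_max_pending/D | mdb -k

(the ::zfs_params dcmd comes from the zfs mdb module linked above; the
second form just prints a single variable).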
Post by Trygve Laugstøl
Post by Marko Milisavljevic
Given how common Sil3114 chipset is in
my-old-computer-became-home-server segment, I am sure this workaround
will be appreciated by many who google their way here. And just in
case it is not clear, what j means below is to add these two lines in
set zfs:zfs_vdev_min_pending=1
set zfs:zfs_vdev_max_pending=1
May 20 01:22:29 deservio genunix: [ID 492708 kern.notice] sorry,
variable 'zfs_vdev_min_pending' is not defined in the 'zfs'
May 20 01:22:29 deservio genunix: [ID 966847 kern.notice] module
May 20 01:22:29 deservio genunix: [ID 100000 kern.notice]
May 20 01:22:29 deservio genunix: [ID 492708 kern.notice] sorry,
variable 'zfs_vdev_max_pending' is not defined in the 'zfs'
May 20 01:22:29 deservio genunix: [ID 966847 kern.notice] module
May 20 01:22:29 deservio genunix: [ID 100000 kern.notice]
I'm running b60.