Discussion:
Oracle on ZFS vs. UFS
Daniel Rock
2006-05-19 18:15:39 UTC
Permalink
Hi,

I'm preparing a personal TPC-H benchmark. The goal is not to measure or
optimize the database performance, but to compare ZFS to UFS in similar
configurations.

At the moment I'm preparing the tests at home. The test setup is as
follows:
. Solaris snv_37
. 2 x AMD Opteron 252
. 4 GB RAM
. 2 x 80 GB ST380817AS
. Oracle 10gR2 (small SGA (320m))

The disks also contain the OS image (mirrored via SVM). On the remaining
space I have created one zpool (on one disk) and one SVM metadevice with a
UFS filesystem on top (on the other disk).
Later I want to rerun the tests on an old E3500 (4x400MHz, 2GB RAM) with
two A5200 attached (~15 still alive 9GB disks each).

The first results at home are not very promising for ZFS.

I measured:
. database creation
. catalog integration (catalog + catproc)
. tablespace creation
. loading data into the database from dbgen with sqlldr

I can provide all the scripts (and precompiled binaries for qgen and dbgen
(SPARC + x86)) if anyone wants to verify my tests.

In most of these tests UFS was considerably faster than ZFS. I tested
. ZFS with default options
. ZFS with compression enabled
. ZFS without checksums
. UFS (newfs: -f 8192 -i 2097152; tunefs: -e 6144; mount: nologging)
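
For reference, the UFS options above map onto commands roughly like this
(a sketch; the device and mount point are placeholders, not my actual paths):

```shell
# Sketch of the UFS setup listed above; /dev/rdsk/c1t1d0s7 and /oradata
# are placeholder names.
newfs -f 8192 -i 2097152 /dev/rdsk/c1t1d0s7    # 8K fragments, few inodes
tunefs -e 6144 /dev/rdsk/c1t1d0s7              # maxbpg: blocks per cyl group
mount -o nologging /dev/dsk/c1t1d0s7 /oradata  # UFS logging disabled
```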


Below are the (preliminary) results (with a 1GB dataset from dbgen); runtimes
in minutes:seconds.

                 UFS    ZFS (default)  ZFS+comp  ZFS+nochksum
db creation      0:38       0:42         0:18        0:40
catalog          6:19      12:05        11:55       12:04
ts creation      0:13       0:14         0:04        0:16
data load[1]     8:49      26:20        25:39       26:19
index creation   0:48       0:38         0:31        0:36
key creation     1:55       1:31         1:18        1:25

[1] dbgen writes into named pipes, which are read back by sqlldr, so no
interim files are created.
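
The pipe-based load can be sketched generically. The dbgen/sqlldr roles named
in the comments are how my benchmark uses the pipe; here a plain
producer/consumer stands in for them:

```shell
# Stream rows through a named pipe so no interim file lands on disk.
# In the benchmark the writer is dbgen and the reader is sqlldr, with
# the pipe given to sqlldr as its datafile; printf/cat stand in here.
pipe=/tmp/lineitem.pipe
out=/tmp/loaded.dat
mkfifo "$pipe"

printf '1|widget|9.99\n2|gadget|4.50\n' > "$pipe" &  # producer (dbgen's role)
cat "$pipe" > "$out"                                 # consumer (sqlldr's role)
wait
rm -f "$pipe"
wc -l < "$out"    # both rows arrive without a temp file in between
```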

Especially on catalog creation and loading data into the database, UFS is
faster than ZFS by a factor of 2-3 (regardless of ZFS options).

ZFS is faster than UFS only for read-intensive tasks, and for file creation
when compression is enabled. This is no surprise: the machine has 4GB RAM,
of which at least 3GB are unused, so ZFS has plenty of room for caching
(all datafiles together use just 2.8GB of disk space). If I enlarge the
dataset, I suspect UFS will regain the lead even on the tests where ZFS
currently performs better.

I will now prepare the query benchmark to see how ZFS performs with a larger
amount of parallelism in the database. To also test read throughput of ZFS
vs. UFS, instead of using a larger dataset I will cap the memory available
to the OS by setting physmem to 1GB.
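
The cap goes into /etc/system and takes effect after a reboot; physmem is
counted in pages (4KB each on this x86 box), so 1GB is 262144 pages:

```shell
# /etc/system fragment: limit usable physical memory to 1GB
# (262144 pages x 4KB/page); requires a reboot to take effect.
set physmem=262144
```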


--
Daniel
Bart Smaalders
2006-05-19 18:46:15 UTC
Permalink
Post by Daniel Rock
I'm preparing a personal TPC-H benchmark. The goal is not to measure or
optimize the database performance, but to compare ZFS to UFS in similar
configurations.
[...]
In most of these tests UFS was considerably faster than ZFS.
[...]
I will now prepare the query benchmark to see how ZFS performs with a larger
amount of parallelism in the database.
How big is the database?

Since oracle writes in small block sizes, did you set the recordsize for
ZFS?

From the zfs man page:

recordsize=size

Specifies a suggested block size for files in the file
system. This property is designed solely for use with
database workloads that access files in fixed-size
records. ZFS automatically tunes block sizes according
to internal algorithms optimized for typical access
patterns.

For databases that create very large files but access
them in small random chunks, these algorithms may be
suboptimal. Specifying a "recordsize" greater than or
equal to the record size of the database can result in
significant performance gains. Use of this property for
general purpose file systems is strongly discouraged,
and may adversely affect performance.
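
Applied to this setup, that would mean matching recordsize to Oracle's
db_block_size before the datafiles are created (the pool/filesystem name
below is a placeholder):

```shell
# Match recordsize to Oracle's 8K db_block_size; "tank/oradata" is a
# placeholder for the actual filesystem holding the datafiles.
zfs set recordsize=8k tank/oradata
zfs get recordsize tank/oradata   # verify the property
```

Note that recordsize only affects files created after the property is set,
so existing datafiles would have to be recreated.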

- Bart



Bart Smaalders Solaris Kernel Performance
***@cyber.eng.sun.com http://blogs.sun.com/barts
Daniel Rock
2006-05-19 21:09:38 UTC
Permalink
Post by Bart Smaalders
How big is the database?
After all the data has been loaded, all datafiles together are 2.8GB; the
SGA is 320MB. But I don't think size matters for this problem, since you can
already see during the catalog creation phase that UFS is 2x faster.
Post by Bart Smaalders
Since oracle writes in small block sizes, did you set the recordsize for
ZFS?
recordsize is default (128K). Oracle uses:
db_block_size=8192
db_file_multiblock_read_count=16

I tried with "db_block_size=32768" but the results got worse.

I have just rerun the first parts of my benchmark (database + catalog
creation) with different parameters.

The datafiles are deleted before each run, so I assume that when Oracle
recreates them they will already use the modified ZFS parameters (so I don't
have to recreate the zpool/filesystem).

Below the results (UFS again as the reference point):
UFS written as UFS(forcedirectio?,ufs:blocksize,oracle:db_block_size)
ZFS written as ZFS(zfs:compression,zfs:recordsize,oracle:db_block_size)

These results are now run with memory capping in effect (physmem=262144 (1GB))

                             db creation   catalog creation
UFS(-,8K,8K) [default]        0:41.851        6:17.530
UFS(forcedirectio,8K,8K)      0:40.479        6:03.688
UFS(forcedirectio,8K,32K)     0:48.718        8:19.359

ZFS(off,128K,8K) [default]    0:52.427       13:28.081
ZFS(on,128K,8K)               0:50.791       14:27.919
ZFS(on,8K,8K)                 0:42.611       13:34.464
ZFS(off,32K,32K)              1:40.038       15:35.177

(times in min:sec.msec)

So you win a few percent, but ZFS is still slower than UFS. UFS catalog
creation is already mostly CPU bound: during the ~6 minutes of catalog
creation, the corresponding oracle process consumes ~5:30 minutes of CPU
time. So for UFS there is little margin for improvement.
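
The CPU-bound claim is easy to check arithmetically from the numbers above
(5:30 of CPU time against the 6:17 wall time of the UFS default run):

```shell
# Fraction of wall time the oracle process spent on CPU during
# UFS catalog creation: ~5:30 CPU over ~6:17 wall time.
wall=$((6*60+17))   # 377 seconds wall time
cpu=$((5*60+30))    # 330 seconds CPU time
echo $((cpu * 100 / wall))   # prints 87 (percent), i.e. mostly CPU bound
```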


If you have Oracle installed you can easily check for yourself. I have
uploaded my init.ora file and the DB creation script to
http://www.deadcafe.de/perf/
Just modify the variables
. ADMIN (location of the oracle admin files)
. DBFILES (ZFS or UFS where datafiles should be placed)
. and the paths in init.ora

Benchmark results will be in "db.bench" file.


BTW: Why is maxphys still only 56 kByte by default on x86? I increased
maxphys to 8MB, but it made little difference in the results:

db creation catalog creation
ZFS(off,128K,8K) (*) 0:53.250 13:32.369

(*) maxphys = 8388608
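
The footnoted value is another /etc/system fragment, again needing a reboot:

```shell
# /etc/system fragment: raise maxphys from the x86 default
# (57344 bytes = 56K) to 8MB.
set maxphys=8388608
```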


Daniel
Daniel Rock
2006-05-19 22:17:16 UTC
Permalink
Post by Daniel Rock
(*) maxphys = 8388608
Pedantically, because ZFS does 128kByte I/Os. Setting maxphys >
128kBytes won't make any difference.
I know, but with the default maxphys value of 56kByte on x86 a 128kByte
request will be split into three physical I/Os.


Daniel
