Lutz Schumann
2010-07-01 08:33:41 UTC
Hello list,
I wanted to test deduplication a little and did an experiment.
My question was: can I dedupe infinitely, or is there an upper limit?
So for that I did a very basic test.
- I created a ramdisk-pool (1GB)
- enabled dedup and
- wrote zeros to it (in one single file) until an error is returned.
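For reference, the setup was roughly the following (a sketch; the pool and device names here are illustrative, not the ones from my actual run):

```shell
# Sketch of the test setup on (Open)Solaris; names are illustrative.
ramdiskadm -a ddtest 1g                  # create a 1 GB ramdisk
zpool create rdpool /dev/ramdisk/ddtest  # build a pool on it
zfs set dedup=on rdpool                  # enable deduplication

# write zeros into a single file until ENOSPC
dd if=/dev/zero of=/rdpool/zeros bs=128k

# inspect the accounting afterwards
zfs list rdpool
zpool list rdpool
zdb -D rdpool                            # DDT statistics
```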
The size of the pool was 1046 MB. I was able to write 62 GB to it before it said "no space left on device". The block size was 128k, so I was able to write ~507,000 blocks to the pool.
With the device being full, I see the following:
1) zfs list reports that no space is left (AVAIL=0)
2) zpool reports that the dedup factor was ~507,000x
3) zpool also reports that 8.6 MB of space was allocated in the pool (0% used)
So to me it looks like something is broken in ZFS accounting with dedup.
- zpool and zfs free-space reporting do not align
- the real deduplication factor was not 507,000 (which would mean I could have written 507,000 x 1 GB = a lot to the pool)
- calculating 1046 MB / 507,000 = 2.1 KB, so for each 128k block, 2.1 KB of data has been written (assuming zfs list is correct). What is this? Metadata? That would mean approx. 1.6% metadata overhead in ZFS (1/(128k/2.1k)).
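The arithmetic can be checked quickly with shell integer math, using only the figures from the runs above:

```shell
# Sanity-check the figures above (integer shell arithmetic).
blocks=$(( 62 * 1024 * 1024 / 128 ))        # 62 GB of 128k records, in blocks
echo "blocks written: $blocks"              # -> 507904, i.e. ~507,000

# pool space charged per block, in bytes: 1046 MB / blocks
echo "bytes/block: $(( 1046 * 1024 * 1024 / blocks ))"   # -> 2159, ~2.1 KB
# 2159 / 131072 is roughly 1.6% of each 128k record
```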
I repeated the same thing with a recordsize of 32k. The funny thing is:
- again, 60 GB could be written before "no space left"
- 31 MB of space was allocated in the pool (zpool list)
The version of the pool is 25.
During the experiment I could nicely see:
- that performance on ramdisk is CPU bound, doing ~125 MB/sec per core.
- performance scales linearly when adding CPU cores (125 MB/s for 1 core, 253 MB/s for 2 cores, 408 MB/s for 4 cores).
- that the upper size of the deduplication table is blocks * ~150 bytes, independent of the dedup factor
- the DDT does not grow for deduplicatable blocks (zdb -D)
- performance goes down by a factor of ~4 when switching from the "closest" allocation policy to "best fit" (when the pool fills, the rate drops from 250 MB/s to 67 MB/s). I suspect even worse results on spinning media because of head movements (>10x slowdown).
Does anyone know why the dedup factor is wrong? Any insight into what has actually been written (compressed metadata, deduped metadata, etc.) would be greatly appreciated.
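If the ~150 bytes/entry figure holds, you can estimate the worst-case DDT footprint for a larger pool. A sketch (the 150-byte figure is my measurement above, not an official ZFS constant, and the 1 TiB pool is a made-up example):

```shell
# Upper-bound DDT size: unique blocks * ~150 bytes/entry (measured, not official).
# Example: a hypothetical 1 TiB pool filled with unique 128 KiB records.
DDT_ENTRY_BYTES=150
pool_kib=$(( 1 << 30 ))                   # 1 TiB expressed in KiB
entries=$(( pool_kib / 128 ))             # number of 128 KiB records
echo "$(( entries * DDT_ENTRY_BYTES / 1024 / 1024 )) MiB"   # -> 1200 MiB
```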
Regards,
Robert
--
This message posted from opensolaris.org