Discussion: ZFS Distro Advice

Tiernan OToole
2013-02-25 14:57:51 UTC
Good morning all.

My home NAS died over the weekend, and it leaves me with a lot of spare
drives (five 2TB and three 1TB disks). I have a Dell PowerEdge 2900 server
sitting in the house which has not been doing much lately (I bought it
a few years back with the intent of using it as a storage box, since it has
8 hot-swap drive bays), and I am now looking at building the NAS using ZFS...

But now I am confused as to which OS to use... OpenIndiana? Nexenta?
FreeNAS/FreeBSD?

I need something that will allow me to share files over SMB (3 if
possible), NFS, AFP (for Time Machine) and iSCSI. Ideally, I would like
something I can manage "easily" and something that works with the Dell...

Any recommendations? Any comparisons between them?

Thanks.
--
Tiernan O'Toole
blog.lotas-smartman.net
www.geekphotographer.com
www.tiernanotoole.ie
Volker A. Brandt
2013-02-25 15:23:06 UTC
Hi Tiernan!
Post by Tiernan OToole
But, now i am confused as to what OS to use... OpenIndiana? Nexenta?
FreeNAS/FreeBSD?
I need something that will allow me to share files over SMB (3 if
possible), NFS, AFP (for Time Machine) and iSCSI. Ideally, i would
like something i can manage "easily" and something that works with
the Dell...
I can recommend FreeNAS. It lives on a USB stick, thus leaving all
8 of your disk slots free. It can do all the things you have listed
above. It has a nice management GUI to bring it all together.
And it is free, so you can download it and see if it recognizes all
your hardware, especially the storage and network controllers.


Best regards -- Volker A. Brandt
--
------------------------------------------------------------------------
Volker A. Brandt Consulting and Support for Oracle Solaris
Brandt & Brandt Computer GmbH WWW: http://www.bb-c.de/
Am Wiesenpfad 6, 53340 Meckenheim, GERMANY Email: ***@bb-c.de
Handelsregister: Amtsgericht Bonn, HRB 10513 Schuhgröße: 46
Geschäftsführer: Rainer J.H. Brandt und Volker A. Brandt

"When logic and proportion have fallen sloppy dead"
Suleyman Nazif Kutlu
2013-02-25 16:21:12 UTC
I just started the same migration...

I tested OpenIndiana w/napp-it, Illumos w/napp-it, Debian GNU/kFreeBSD
and Nexenta Community Edition, trying iSCSI, SMB and NFS. I did not find
time to test FreeNAS. All tests were performed as a VM guest on ESXi 5.0.

SMB is implemented in the kernel on OpenIndiana, Illumos and Nexenta, so it
performs better there. Debian GNU/kFreeBSD needs Samba to provide SMB.


I found Nexenta the easiest to manage, thanks to its GUI.


My two cents: if you do not want to dig into details from the CLI, Nexenta
or FreeNAS will be the best choice. Illumos, OpenIndiana and Debian
GNU/kFreeBSD need a deeper dive at the OS level...

---

SNK
Post by Volker A. Brandt
Hi Tiernan!
Post by Tiernan OToole
But, now i am confused as to what OS to use... OpenIndiana? Nexenta?
FreeNAS/FreeBSD?
I need something that will allow me to share files over SMB (3 if
possible), NFS, AFP (for Time Machine) and iSCSI. Ideally, i would
like something i can manage "easily" and something that works with
the Dell...
I can recommend FreeNAS. It lives in a USB stick, thus leaving all
your 8 disk slots free. It can do all the things you have listed
above. It has a nice management GUI to bring it all together.
And it is free, so you can download it and see if it recognizes all
your hardware, especially the storage and network controllers.
Best regards -- Volker A. Brandt
_______________________________________________
zfs-discuss mailing list
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Tim Cook
2013-02-25 17:27:35 UTC
Post by Tiernan OToole
Good morning all.
My home NAS died over the weekend, and it leaves me with a lot of spare
drives (5 2Tb and 3 1Tb disks). I have a Dell Poweredge 2900 Server sitting
in the house, which has not been doing much over the last while (bought it
a few years back with the intent of using it as a storage box, since it has
8 Hot Swap drive bays) and i am now looking at building the NAS using ZFS...
But, now i am confused as to what OS to use... OpenIndiana? Nexenta?
FreeNAS/FreeBSD?
I need something that will allow me to share files over SMB (3 if
possible), NFS, AFP (for Time Machine) and iSCSI. Ideally, i would like
something i can manage "easily" and something that works with the Dell...
Any recommendations? Any comparisons to each?
Thanks.
All of them should provide the basic functionality you're looking for.
None of them will provide SMB3 (at all) or AFP (without a third-party
package).

--Tim
Volker A. Brandt
2013-02-25 17:57:09 UTC
Post by Tim Cook
Post by Tiernan OToole
I need something that will allow me to share files over SMB (3 if
possible), NFS, AFP (for Time Machine) and iSCSI. Ideally, i would
like something i can manage "easily" and something that works with
the Dell...
All of them should provide the basic functionality you're looking for.
None of them will provide SMB3 (at all) or AFP (without a third
party package).
FreeNAS has AFP built-in, including a Time Machine discovery method.

The latest FreeNAS is still based on Samba 3.x, but they are aware
of 4.x and will probably integrate it at some point in the future.
Then you should have SMB3. I don't know how far along they are...


Best regards -- Volker
Tim Cook
2013-02-25 18:11:31 UTC
Post by Volker A. Brandt
Post by Tim Cook
Post by Tiernan OToole
I need something that will allow me to share files over SMB (3 if
possible), NFS, AFP (for Time Machine) and iSCSI. Ideally, i would
like something i can manage "easily" and something that works with
the Dell...
All of them should provide the basic functionality you're looking for.
None of them will provide SMB3 (at all) or AFP (without a third
party package).
FreeNAS has AFP built-in, including a Time Machine discovery method.
The latest FreeNAS is still based on Samba 3.x, but they are aware
of 4.x and will probably integrate it at some point in the future.
Then you should have SMB3. I don't know how far along they are...
Best regards -- Volker
FreeNAS comes with a package pre-installed to add AFP support. There is no
native AFP support in FreeBSD and by association FreeNAS.

--Tim
Tiernan OToole
2013-02-26 08:33:14 UTC
Thanks all! I will check out FreeNAS and see what it can do... I will also
check my RAID card and see if it can work with JBOD... fingers crossed...
The machine has a couple of internal SATA ports (I think there are 2, could be
4), so I was thinking of using those for boot disks and SSDs later...

As a follow-up question: data deduplication. The machine, to start, will
have about 5GB of RAM. I read somewhere that 20TB of storage would require
about 8GB of RAM, depending on block size... Since I don't know block sizes
yet (I store a mix of VMs, TV shows, movies and backups on the NAS) I am not
sure how much memory I will need (my estimate is 10TB raw (8TB usable?) in a
RAIDZ1 pool, and then 3TB raw in a striped pool). If I don't have enough
memory now, can I enable dedup at a later stage when I add memory? Also,
if I pick FreeBSD now and want to move to, say, Nexenta, is that possible?
Assuming the drives are just JBOD drives (to be confirmed), could they just
get imported?

Thanks.
Post by Tim Cook
Post by Volker A. Brandt
Post by Tim Cook
Post by Tiernan OToole
I need something that will allow me to share files over SMB (3 if
possible), NFS, AFP (for Time Machine) and iSCSI. Ideally, i would
like something i can manage "easily" and something that works with
the Dell...
All of them should provide the basic functionality you're looking for.
None of them will provide SMB3 (at all) or AFP (without a third
party package).
FreeNAS has AFP built-in, including a Time Machine discovery method.
The latest FreeNAS is still based on Samba 3.x, but they are aware
of 4.x and will probably integrate it at some point in the future.
Then you should have SMB3. I don't know how far along they are...
Best regards -- Volker
FreeNAS comes with a package pre-installed to add AFP support. There is
no native AFP support in FreeBSD and by association FreeNAS.
--Tim
Sašo Kiselkov
2013-02-26 08:43:44 UTC
Post by Tiernan OToole
As a follow up question: Data Deduplication: The machine, to start, will
have about 5Gb RAM. I read somewhere that 20TB storage would require about
8GB RAM, depending on block size...
The typical wisdom is that 1TB of dedup'ed data = 1GB of RAM. 5GB of RAM
seems too small for a 20TB pool of dedup'ed data.
Unless you know what you're doing, I'd go with just compression and let
dedup be: compression has predictable performance and doesn't suffer from
scaling problems.
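That rule of thumb can be sanity-checked with quick arithmetic. The sketch below assumes roughly 320 bytes of RAM per in-core dedup table (DDT) entry, a commonly quoted approximation rather than an exact figure, and the function name is just for illustration:

```python
# Rough estimate of the RAM needed to hold the dedup table (DDT) in core.
# The ~320 bytes per unique block is a commonly quoted approximation;
# actual entry size varies by platform and pool layout.
DDT_BYTES_PER_ENTRY = 320

def ddt_ram_bytes(data_bytes: int, avg_block_bytes: int) -> int:
    """Estimate in-core DDT size for a fully dedup'ed dataset."""
    entries = data_bytes // avg_block_bytes
    return entries * DDT_BYTES_PER_ENTRY

TIB = 2 ** 40
# 10 TiB of data at the default 128 KiB recordsize:
print(ddt_ram_bytes(10 * TIB, 128 * 1024) / 2 ** 30)  # 25.0 (GiB)
# The same 10 TiB as small-block VM images (8 KiB blocks) is far worse:
print(ddt_ram_bytes(10 * TIB, 8 * 1024) / 2 ** 30)    # 400.0 (GiB)
```

Whatever the exact per-entry cost, both figures dwarf 5GB of RAM, which is why compression is the safer default here.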
Post by Tiernan OToole
If i dont have enough memory now, can i enable DeDupe at a later stage
when i add memory?
Yes.
Post by Tiernan OToole
Also, if i pick FreeBSD now, and want to move to, say, Nexenta, is that
possible? Assuming the drives are just JBOD drives (to be confirmed)
could they just get imported?
Yes, that's the whole point of open storage.

I'd also recommend that you go and subscribe to ***@lists.illumos.org,
since this list is going to get shut down by Oracle next month.

Cheers,
--
Saso
Gary Driggs
2013-02-26 14:51:08 UTC
On Feb 26, 2013, at 12:44 AM, "Sašo Kiselkov" wrote:

I'd also recommend that you go and subscribe to ***@lists.illumos.org, since
this list is going to get shut down by Oracle next month.


Whose description still reads, "everything ZFS running on illumos-based
distributions."

-Gary
Sašo Kiselkov
2013-02-26 15:56:50 UTC
Post by Sašo Kiselkov
this list is going to get shut down by Oracle next month.
Whose description still reads, "everything ZFS running on illumos-based
distributions."
We've never dismissed any topic or issue as "not our problem". All
sensible ZFS-related discussion is welcome and taken seriously.

Cheers,
--
Saso
Eugen Leitl
2013-02-26 16:57:46 UTC
I can't seem to find this list. Do you have a URL for it?
Mailman, hopefully?
Post by Sašo Kiselkov
this list is going to get shut down by Oracle next month.
Whose description still reads, "everything ZFS running on illumos-based
distributions."
-Gary
--
Eugen* Leitl <a href="http://leitl.org">leitl</a> http://leitl.org
______________________________________________________________
ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE
Sašo Kiselkov
2013-02-26 17:01:39 UTC
Post by Eugen Leitl
I can't seem to find this list. Do you have an URL for that?
Mailman, hopefully?
http://wiki.illumos.org/display/illumos/illumos+Mailing+Lists

--
Saso
Eugen Leitl
2013-02-26 17:10:24 UTC
Post by Sašo Kiselkov
Post by Eugen Leitl
I can't seem to find this list. Do you have an URL for that?
Mailman, hopefully?
http://wiki.illumos.org/display/illumos/illumos+Mailing+Lists
Oh, it's the illumos-zfs one. Had me confused.
Bob Friesenhahn
2013-02-27 03:22:32 UTC
Post by Sašo Kiselkov
down by Oracle next month.
Whose description still reads, "everything ZFS running on illumos-based distributions."
Even FreeBSD's zfs is now based on zfs from Illumos. FreeBSD and
Linux zfs developers contribute fixes back to zfs in Illumos.

Bob
--
Bob Friesenhahn
***@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Ahmed Kamal
2013-02-27 11:32:03 UTC
How is the quality of the ZFS Linux port today? Is it comparable to Illumos,
or at least FreeBSD? Can I trust production data to it?


On Wed, Feb 27, 2013 at 5:22 AM, Bob Friesenhahn wrote:
Post by Sašo Kiselkov
I'd also recommend that you go and subscribe to
down by Oracle next month.
Whose description still reads, "everything ZFS running on illumos-based distributions."
Even FreeBSD's zfs is now based on zfs from Illumos. FreeBSD and Linux
zfs developers contribute fixes back to zfs in Illumos.
Bob
Sašo Kiselkov
2013-02-27 11:37:26 UTC
Post by Ahmed Kamal
How is the quality of the ZFS Linux port today? Is it comparable to Illumos
or at least FreeBSD ? Can I trust production data to it ?
Can't speak from personal experience, but a colleague of mine has been
running the PPA builds on Ubuntu and has had, well, a less than stellar
experience. It shows promise, but I'm not sure it's there yet.

Cheers,
--
Saso
Dan Swartzendruber
2013-02-27 12:57:35 UTC
I've been using it since rc13. It's been stable for me as long as you don't
get into things like zvols and such...

-----Original Message-----
From: zfs-discuss-***@opensolaris.org
[mailto:zfs-discuss-***@opensolaris.org] On Behalf Of Sašo Kiselkov
Sent: Wednesday, February 27, 2013 6:37 AM
To: zfs-***@opensolaris.org
Subject: Re: [zfs-discuss] ZFS Distro Advice
Post by Ahmed Kamal
How is the quality of the ZFS Linux port today? Is it comparable to
Illumos or at least FreeBSD ? Can I trust production data to it ?
Can't speak from personal experience, but a colleague of mine has been running
the PPA builds on Ubuntu and has had, well, a less than stellar experience. It
shows promise, but I'm not sure it's there yet.

Cheers,
--
Saso
Tim Cook
2013-02-27 19:05:30 UTC
Post by Dan Swartzendruber
I've been using it since rc13. It's been stable for me as long as you don't
get into things like zvols and such...
Then it definitely isn't at the level of FreeBSD, and personally I would
not consider that production ready.


--Tim
Dan Swartzendruber
2013-02-27 19:16:59 UTC
On Wed, Feb 27, 2013 at 2:57 AM, Dan Swartzendruber wrote:
I've been using it since rc13. It's been stable for me as long as you don't
get into things like zvols and such...
Then it definitely isn't at the level of FreeBSD, and personally I
would not consider that production ready.
Everyone has to make their own risk assessment. Keep in mind, it is
described as a release candidate. I understand zvols are an important
feature, but I can do without them, so I am...
Tim Cook
2013-02-26 08:44:34 UTC
Post by Tiernan OToole
Thanks all! I will check out FreeNAS and see what it can do... I will also
check my RAID Card and see if it can work with JBOD... fingers crossed...
The machine has a couple internal SATA ports (think there are 2, could be
4) so i was thinking of using those for boot disks and SSDs later...
As a follow up question: Data Deduplication: The machine, to start, will
have about 5Gb RAM. I read somewhere that 20TB storage would require about
8GB RAM, depending on block size... Since i dont know block sizes, yet (i
store a mix of VMs, TV Shows, Movies and backups on the NAS) I am not sure
how much memory i will need (my estimate is 10TB RAW (8TB usable?) in a
ZRAID1 pool, and then 3TB RAW in a striped pool). If i dont have enough
memory now, can i enable DeDupe at a later stage when i add memory? Also,
if i pick FreeBSD now, and want to move to, say, Nexenta, is that possible?
Assuming the drives are just JBOD drives (to be confirmed) could they just
get imported?
Thanks.
Yes, you can move between FreeBSD and Illumos based distros as long as you
are at a compatible zpool version (which they currently are). I'd avoid
deduplication unless you absolutely need it... it's still a bit of a
kludge. Stick to compression and your world will be a much happier place.
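The move between platforms is a plain export/import cycle. A sketch, with "tank" as a hypothetical pool name:

```shell
# On the old system (e.g. FreeBSD): cleanly unmount and export the pool.
zpool export tank

# After booting the new OS (e.g. a Nexenta/illumos distro) with the same
# disks attached:
zpool import          # with no argument, lists importable pools found on disk
zpool import tank     # imports the pool and mounts its datasets
```

Note that this only works toward an OS with an equal or newer zpool version, and `zpool upgrade` is one-way, so it is safest to hold off upgrading the pool until you have settled on a platform.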

--Tim
Tiernan OToole
2013-02-26 09:11:19 UTC
Thanks again, lads. I will take all that advice on board, and will join
that new group also!

Thanks again!

--Tiernan
Post by Tim Cook
Post by Tiernan OToole
Thanks all! I will check out FreeNAS and see what it can do... I will
also check my RAID Card and see if it can work with JBOD... fingers
crossed... The machine has a couple internal SATA ports (think there are 2,
could be 4) so i was thinking of using those for boot disks and SSDs
later...
As a follow up question: Data Deduplication: The machine, to start, will
have about 5Gb RAM. I read somewhere that 20TB storage would require about
8GB RAM, depending on block size... Since i dont know block sizes, yet (i
store a mix of VMs, TV Shows, Movies and backups on the NAS) I am not sure
how much memory i will need (my estimate is 10TB RAW (8TB usable?) in a
ZRAID1 pool, and then 3TB RAW in a striped pool). If i dont have enough
memory now, can i enable DeDupe at a later stage when i add memory? Also,
if i pick FreeBSD now, and want to move to, say, Nexenta, is that possible?
Assuming the drives are just JBOD drives (to be confirmed) could they just
get imported?
Thanks.
Yes, you can move between FreeBSD and Illumos based distros as long as you
are at a compatible zpool version (which they currently are). I'd avoid
deduplication unless you absolutely need it... it's still a bit of a
kludge. Stick to compression and your world will be a much happier place.
--Tim
Richard Elling
2013-02-26 17:57:05 UTC
Thanks all! I will check out FreeNAS and see what it can do... I will also check my RAID Card and see if it can work with JBOD... fingers crossed... The machine has a couple internal SATA ports (think there are 2, could be 4) so i was thinking of using those for boot disks and SSDs later...
As a follow up question: Data Deduplication: The machine, to start, will have about 5Gb RAM. I read somewhere that 20TB storage would require about 8GB RAM, depending on block size... Since i dont know block sizes, yet (i store a mix of VMs, TV Shows, Movies and backups on the NAS)
Consider using different policies for different data. For traditional file systems, you
had relatively few policy options: readonly, nosuid, quota, etc. With ZFS, dedup and
compression are also policy options. In your case, dedup for your media is not likely
to be a good policy, but dedup for your backups could be a win (unless you're using
something that already doesn't back up duplicate data -- e.g. most backup utilities).
A way to approach this is to think of your directory structure and create file systems
to match the policies. For example:
/home/richard = compressed (default top-level, since properties are inherited)
/home/richard/media = compressed
/home/richard/backup = compressed + dedup
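Expressed as ZFS commands, that layout might look like the following sketch (the pool name "tank" is hypothetical, and properties are inherited by child file systems unless overridden):

```shell
zfs create -o compression=on tank/home/richard    # top level: compressed
zfs create tank/home/richard/media                # inherits compression=on
zfs create -o dedup=on tank/home/richard/backup   # compressed + dedup'ed
zfs get -r compression,dedup tank/home/richard    # verify the policies
```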

-- richard
I am not sure how much memory i will need (my estimate is 10TB RAW (8TB usable?) in a ZRAID1 pool, and then 3TB RAW in a striped pool). If i dont have enough memory now, can i enable DeDupe at a later stage when i add memory? Also, if i pick FreeBSD now, and want to move to, say, Nexenta, is that possible? Assuming the drives are just JBOD drives (to be confirmed) could they just get imported?
Thanks.
Post by Tim Cook
Post by Tiernan OToole
I need something that will allow me to share files over SMB (3 if
possible), NFS, AFP (for Time Machine) and iSCSI. Ideally, i would
like something i can manage "easily" and something that works with
the Dell...
All of them should provide the basic functionality you're looking for.
None of them will provide SMB3 (at all) or AFP (without a third
party package).
FreeNAS has AFP built-in, including a Time Machine discovery method.
The latest FreeNAS is still based on Samba 3.x, but they are aware
of 4.x and will probably integrate it at some point in the future.
Then you should have SMB3. I don't know how far along they are...
Best regards -- Volker
FreeNAS comes with a package pre-installed to add AFP support. There is no native AFP support in FreeBSD and by association FreeNAS.
--Tim
--

***@RichardElling.com
+1-760-896-4422
Bob Friesenhahn
2013-02-27 03:27:42 UTC
Post by Richard Elling
Consider using different policies for different data. For traditional file systems, you
had relatively few policy options: readonly, nosuid, quota, etc. With ZFS, dedup and
compression are also policy options. In your case, dedup for your media is not likely
to be a good policy, but dedup for your backups could be a win (unless you're using
something that already doesn't backup duplicate data -- eg most backup utilities).
A way to approach this is to think of your directory structure and create file systems
I am finding that rsync with the right options (to directly
block-overwrite) plus zfs snapshots is providing me with pretty
amazing "deduplication" for backups without even enabling
deduplication in zfs. Now backup storage goes a very long way.

Bob
Ian Collins
2013-02-27 03:34:21 UTC
Post by Bob Friesenhahn
Post by Richard Elling
Consider using different policies for different data. For traditional file systems, you
had relatively few policy options: readonly, nosuid, quota, etc. With ZFS, dedup and
compression are also policy options. In your case, dedup for your media is not likely
to be a good policy, but dedup for your backups could be a win (unless you're using
something that already doesn't backup duplicate data -- eg most backup utilities).
A way to approach this is to think of your directory structure and create file systems
I am finding that rsync with the right options (to directly
block-overwrite) plus zfs snapshots is providing me with pretty
amazing "deduplication" for backups without even enabling
deduplication in zfs. Now backup storage goes a very long way.
We do the same for all of our "legacy" operating system backups: take a
snapshot, then do an rsync. An excellent way of maintaining incremental
backups for those.
--
Ian.
Bob Friesenhahn
2013-02-27 03:42:13 UTC
Post by Ian Collins
Post by Bob Friesenhahn
I am finding that rsync with the right options (to directly
block-overwrite) plus zfs snapshots is providing me with pretty
amazing "deduplication" for backups without even enabling
deduplication in zfs. Now backup storage goes a very long way.
We do the same for all of our "legacy" operating system backups. Take a
snapshot then do an rsync and an excellent way of maintaining incremental
backups for those.
Magic rsync options used:

-a --inplace --no-whole-file --delete-excluded

This causes rsync to overwrite the file blocks in place rather than
writing to a new temporary file first. As a result, zfs COW produces
primitive "deduplication" of at least the unchanged blocks (by writing
nothing) while writing new COW blocks for the changed blocks.
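Put together, one nightly cycle of this scheme might look like the following sketch (the paths and pool name are hypothetical):

```shell
#!/bin/sh
SRC=/export/home/         # data to back up (trailing slash matters to rsync)
DEST=/tank/backup/home/   # a directory on a ZFS dataset

# Overwrite changed blocks in place so ZFS COW allocates new blocks
# only for data that actually changed.
rsync -a --inplace --no-whole-file --delete-excluded "$SRC" "$DEST"

# Snapshot the result; unchanged blocks stay shared with older snapshots.
zfs snapshot "tank/backup@$(date +%Y-%m-%d)"
```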

Bob
Ian Collins
2013-02-27 04:36:10 UTC
Post by Bob Friesenhahn
Post by Ian Collins
Post by Bob Friesenhahn
I am finding that rsync with the right options (to directly
block-overwrite) plus zfs snapshots is providing me with pretty
amazing "deduplication" for backups without even enabling
deduplication in zfs. Now backup storage goes a very long way.
We do the same for all of our "legacy" operating system backups. Take a
snapshot then do an rsync and an excellent way of maintaining incremental
backups for those.
-a --inplace --no-whole-file --delete-excluded
This causes rsync to overwrite the file blocks in place rather than
writing to a new temporary file first. As a result, zfs COW produces
primitive "deduplication" of at least the unchanged blocks (by writing
nothing) while writing new COW blocks for the changed blocks.
Do these options impact performance or reduce the incremental stream sizes?

I just use -a --delete and the snapshots don't take up much space
(compared with the incremental stream sizes).
--
Ian.
Jim Klimov
2013-02-27 05:08:47 UTC
Post by Ian Collins
Post by Bob Friesenhahn
Post by Ian Collins
Post by Bob Friesenhahn
I am finding that rsync with the right options (to directly
block-overwrite) plus zfs snapshots is providing me with pretty
amazing "deduplication" for backups without even enabling
deduplication in zfs. Now backup storage goes a very long way.
We do the same for all of our "legacy" operating system backups. Take a
snapshot then do an rsync and an excellent way of maintaining incremental
backups for those.
-a --inplace --no-whole-file --delete-excluded
This causes rsync to overwrite the file blocks in place rather than
writing to a new temporary file first. As a result, zfs COW produces
primitive "deduplication" of at least the unchanged blocks (by writing
nothing) while writing new COW blocks for the changed blocks.
Do these options impact performance or reduce the incremental stream sizes?
I just use -a --delete and the snapshots don't take up much space
(compared with the incremental stream sizes).
Well, to be certain, you can create a dataset with a large file in it,
snapshot it, rsync over a changed variant of the file, snapshot again and
compare the referenced sizes. If the file was rewritten into a new temporary
one and then renamed over the original, you'd likely end up with as much
used storage as for the original file. If only the changes are written into
it "in-place", then you'd use a lot less space (and you'd not see a
.garbledfilename in the directory during the process).
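That experiment might be scripted roughly as follows (a sketch for a scratch pool; the names, paths and sizes are hypothetical):

```shell
zfs create tank/rsynctest
dd if=/dev/urandom of=/tank/rsynctest/big.dat bs=1M count=512
zfs snapshot tank/rsynctest@before

# Modify a copy of the file elsewhere, then rsync it back in place:
rsync -a --inplace --no-whole-file /tmp/big.dat.changed /tank/rsynctest/big.dat

zfs snapshot tank/rsynctest@after
# If rsync only rewrote the changed blocks, @before holds on to very
# little unique data; a full rewrite would pin the whole 512 MB.
zfs list -t snapshot -o name,used,referenced -r tank/rsynctest
```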

If you use rsync over network to back up stuff, here's an example of
SMF wrapper for rsyncd, and a config sample to make a snapshot after
completion of the rsync session.

http://wiki.openindiana.org/oi/rsync+daemon+service+on+OpenIndiana

HTH,
//Jim Klimov
Bob Friesenhahn
2013-02-27 14:28:01 UTC
Post by Ian Collins
Post by Bob Friesenhahn
-a --inplace --no-whole-file --delete-excluded
This causes rsync to overwrite the file blocks in place rather than
writing to a new temporary file first. As a result, zfs COW produces
primitive "deduplication" of at least the unchanged blocks (by writing
nothing) while writing new COW blocks for the changed blocks.
Do these options impact performance or reduce the incremental stream sizes?
I don't see any adverse impact on performance and incremental stream
size is quite considerably reduced.

The main risk is that if the disk fills up you may end up with a
corrupted file rather than just an rsync error. However, the
snapshots help because an earlier version of the file is likely
available.
Post by Ian Collins
I just use -a --delete and the snapshots don't take up much space (compared
with the incremental stream sizes).
That is what I used to do before I learned better.

Bob
Matthew Ahrens
2013-03-05 00:58:27 UTC
On Tue, Feb 26, 2013 at 7:42 PM, Bob Friesenhahn
Post by Bob Friesenhahn
Post by Ian Collins
Post by Bob Friesenhahn
I am finding that rsync with the right options (to directly
block-overwrite) plus zfs snapshots is providing me with pretty
amazing "deduplication" for backups without even enabling
deduplication in zfs. Now backup storage goes a very long way.
We do the same for all of our "legacy" operating system backups. Take a
snapshot then do an rsync and an excellent way of maintaining incremental
backups for those.
-a --inplace --no-whole-file --delete-excluded
This causes rsync to overwrite the file blocks in place rather than writing
to a new temporary file first. As a result, zfs COW produces primitive
"deduplication" of at least the unchanged blocks (by writing nothing) while
writing new COW blocks for the changed blocks.
If I understand your use case correctly (the application overwrites
some blocks with the same exact contents), ZFS will ignore these
"no-op" writes only on recent Open ZFS (illumos / FreeBSD / Linux)
builds with checksum=sha256 and compression!=off. AFAIK, Solaris ZFS
will COW the blocks even if their content is identical to what's
already there, causing the snapshots to diverge.

See https://www.illumos.org/issues/3236 for details.

commit 80901aea8e78a2c20751f61f01bebd1d5b5c2ba5
Author: George Wilson <***@delphix.com>
Date: Tue Nov 13 14:55:48 2012 -0800

3236 zio nop-write

--matt
Robert Milkowski
2013-03-05 09:10:06 UTC
Post by Bob Friesenhahn
Post by Bob Friesenhahn
Post by Ian Collins
We do the same for all of our "legacy" operating system backups.
Take
Post by Bob Friesenhahn
Post by Ian Collins
a snapshot then do an rsync and an excellent way of maintaining
incremental backups for those.
-a --inplace --no-whole-file --delete-excluded
This causes rsync to overwrite the file blocks in place rather than
writing to a new temporary file first. As a result, zfs COW produces
primitive "deduplication" of at least the unchanged blocks (by
writing
Post by Bob Friesenhahn
nothing) while writing new COW blocks for the changed blocks.
If I understand your use case correctly (the application overwrites
some blocks with the same exact contents), ZFS will ignore these "no-
I think he meant to rely on rsync here to do in-place updates of files,
and only for changed blocks, with the above parameters (by using rsync's own
delta mechanism). So if you have a file and only one block changed, rsync
will overwrite only that single block on the destination.
Post by Bob Friesenhahn
op" writes only on recent Open ZFS (illumos / FreeBSD / Linux) builds
with checksum=sha256 and compression!=off. AFAIK, Solaris ZFS will COW
the blocks even if their content is identical to what's already there,
causing the snapshots to diverge.
See https://www.illumos.org/issues/3236 for details.
This is interesting. I didn't know about it.
Is there an option similar to verify=on in dedup or does it just assume that
"checksum is your data"?
--
Robert Milkowski
http://milek.blogspot.com
Bob Friesenhahn
2013-03-05 15:02:00 UTC
Post by Matthew Ahrens
Post by Bob Friesenhahn
-a --inplace --no-whole-file --delete-excluded
This causes rsync to overwrite the file blocks in place rather than writing
to a new temporary file first. As a result, zfs COW produces primitive
"deduplication" of at least the unchanged blocks (by writing nothing) while
writing new COW blocks for the changed blocks.
If I understand your use case correctly (the application overwrites
some blocks with the same exact contents), ZFS will ignore these
"no-op" writes only on recent Open ZFS (illumos / FreeBSD / Linux)
builds with checksum=sha256 and compression!=off. AFAIK, Solaris ZFS
will COW the blocks even if their content is identical to what's
already there, causing the snapshots to diverge.
With these rsync options, rsync will only overwrite a "block" if the
contents of the block have changed. Rsync's notion of a block is
different from zfs's, so there is not a perfect overlap.

Rsync does need to read files on the destination filesystem to see if
they have changed. If the system has sufficient RAM (and/or L2ARC)
then files may still be cached from the previous day's run. In most
cases only a small subset of the total files are updated (at least on
my systems) so the caching requirements are small. Files updated on
one day are more likely to be the ones updated on subsequent days.

Bob
--
Bob Friesenhahn
***@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
David Magda
2013-03-05 15:40:41 UTC
Post by Bob Friesenhahn
Rsync does need to read files on the destination filesystem to see if
they have changed. If the system has sufficient RAM (and/or L2ARC)
then files may still be cached from the previous day's run. In most
cases only a small subset of the total files are updated (at least on
my systems) so the caching requirements are small. Files updated on
one day are more likely to be the ones updated on subsequent days.
It's also possible to reduce the amount of the file tree that rsync has
to walk.
 
Most folks simply do a "rsync --options /my/source/ /the/dest/", but if
you use "zfs diff", and parse/feed the output of that to rsync, then the
amount of thrashing can probably be minimized. Especially useful for file
hierarchies that have very many individual files, so you don't have to
stat() every single one.
Russ Poyner
2013-03-05 16:17:25 UTC
Post by David Magda
Post by Bob Friesenhahn
Rsync does need to read files on the destination filesystem to see if
they have changed. If the system has sufficient RAM (and/or L2ARC)
then files may still be cached from the previous day's run. In most
cases only a small subset of the total files are updated (at least on
my systems) so the caching requirements are small. Files updated on
one day are more likely to be the ones updated on subsequent days.
It's also possible to reduce the amount of the file tree that rsync has
to walk.
Most folks simply do a "rsync --options /my/source/ /the/dest/", but if
you use "zfs diff", and parse/feed the output of that to rsync, then the
amount of thrashing can probably be minimized. Especially useful for file
hierarchies that have very many individual files, so you don't have to
stat() every single one.
_______________________________________________
zfs-discuss mailing list
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
David,

Your idea to use zfs diff to limit the need to stat the entire
filesystem tree intrigues me. My current rsync backups are normally
limited by this very factor. It takes longer to walk the filesystem tree
than it does to transfer the new data.

Would you be willing to provide an example of what you mean when you say
parse/feed the output of zfs diff to rsync?

Russ Poyner
David Magda
2013-03-05 19:31:05 UTC
Post by Russ Poyner
Your idea to use zfs diff to limit the need to stat the entire
filesystem tree intrigues me. My current rsync backups are normally
limited by this very factor. It takes longer to walk the filesystem tree
than it does to transfer the new data.
Would you be willing to provide an example of what you mean when you say
parse/feed the output of zfs diff to rsync?
Don't have anything readily available, or a ZFS system handy to hack
something up. The output of "zfs diff" is roughly:

M /myfiles/
M /myfiles/link_to_me (+1)
R /myfiles/rename_me -> /myfiles/renamed
- /myfiles/delete_me
+ /myfiles/new_file

Take the second column and use that as the list of files to check. Solaris'
zfs(1M) has a "-F" option which would output something like:

M / /myfiles/
M F /myfiles/link_to_me (+1)
R /myfiles/rename_me -> /myfiles/renamed
- F /myfiles/delete_me
+ F /myfiles/new_file
+ | /myfiles/new_pipe

So the second column now has a type, and the path is pushed over to the
third column. This way you can simply choose files ("F") and tell rsync to
check just those.
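A minimal sketch of that parsing might look like the following. The
dataset, snapshot, and host names in the comments are invented, and note
that zfs diff octal-escapes unusual filenames, which this simple parser
does not handle:

```shell
#!/bin/sh
# Keep modified (M) and created (+) plain files (F) from "zfs diff -F"
# output, printing each path with the mountpoint prefix removed so the
# result is relative to the transfer root.
parse_zfs_diff() {
    mnt=$1   # mountpoint prefix to strip, e.g. "/myfiles/"
    awk -v mnt="$mnt" \
        '$1 ~ /^[M+]$/ && $2 == "F" { sub("^" mnt, "", $3); print $3 }'
}

# Hypothetical usage, feeding the list to rsync via --files-from:
#   zfs diff -F tank/home@yesterday tank/home@today \
#     | parse_zfs_diff "/tank/home/" > /tmp/changed.list
#   rsync -a --inplace --no-whole-file --files-from=/tmp/changed.list \
#         /tank/home/ backuphost:/backups/home/
# Deletions and renames ("-" and "R" lines) still need separate handling.
```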

Bob Friesenhahn
2013-03-05 16:27:25 UTC
Post by David Magda
It's also possible to reduce the amount of the file tree that rsync has
to walk.
Most folks simply do a "rsync --options /my/source/ /the/dest/", but if
you use "zfs diff", and parse/feed the output of that to rsync, then the
amount of thrashing can probably be minimized. Especially useful for file
hierarchies that have very many individual files, so you don't have to
stat() every single one.
Zfs diff only works for zfs filesystems, and if one is using zfs
filesystems on both ends then rsync may not be the best option anyway
(zfs send would be). In the real world, data may be sourced from many
types of systems and filesystems.

Bob
--
Bob Friesenhahn
***@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Russ Poyner
2013-03-05 17:29:43 UTC
Post by Bob Friesenhahn
Post by David Magda
It's also possible to reduce the amount of the file tree that rsync has
to walk.
Most folks simply do a "rsync --options /my/source/ /the/dest/", but if
you use "zfs diff", and parse/feed the output of that to rsync, then the
amount of thrashing can probably be minimized. Especially useful for file
hierarchies that have very many individual files, so you don't have to
stat() every single one.
Zfs diff only works for zfs filesystems. If one is using zfs
filesystems then rsync may not be the best option. In the real world,
data may be sourced from many types of systems and filesystems.
Bob
Bob,

Good point. Clearly this wouldn't work for my current linux fileserver.
I'm building a replacement that will run FreeBSD 9.1 with a zfs storage
pool. My backups are to a thumper running solaris 10 and zfs in another
department. I have an arm's-length collaboration with the department
that runs the thumper, which likely precludes a direct zfs send.

Rsync has allowed us to transfer data without getting too deep into each
others' system administration. I run an rsync daemon with read only
access to my filesystem that accepts connections from the thumper. They
serve the backups to me via a read-only nfs export. The only problem has
been the iops load generated by my users' millions of small files.
That's why the zfs diff idea excited me, but perhaps I'm missing some
simpler approach.
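The read-only daemon side of that arrangement can be captured in a few
lines of rsyncd.conf. A minimal sketch, with made-up paths and addresses:

```
# Hypothetical /etc/rsyncd.conf: read-only module that only the backup
# host may pull from.
uid = nobody
gid = nobody
use chroot = yes
read only = yes

[home]
    path = /tank/home
    comment = home directories (backup pull)
    hosts allow = 192.0.2.10
```

The backup host would then pull with something like
"rsync -a --inplace --no-whole-file fileserver::home/ /pool/backups/home/".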

Russ
Robert Milkowski
2013-02-26 09:43:11 UTC
Solaris 11.1 (free for non-prod use).





From: zfs-discuss-***@opensolaris.org
[mailto:zfs-discuss-***@opensolaris.org] On Behalf Of Tiernan OToole
Sent: 25 February 2013 14:58
To: zfs-***@opensolaris.org
Subject: [zfs-discuss] ZFS Distro Advice
 
Good morning all.
 
My home NAS died over the weekend, and it leaves me with a lot of spare
drives (5 2Tb and 3 1Tb disks). I have a Dell Poweredge 2900 Server sitting
in the house, which has not been doing much over the last while (bought it a
few years back with the intent of using it as a storage box, since it has 8
Hot Swap drive bays) and i am now looking at building the NAS using ZFS...
 
But, now i am confused as to what OS to use... OpenIndiana? Nexenta?
FreeNAS/FreeBSD?
 
I need something that will allow me to share files over SMB (3 if possible),
NFS, AFP (for Time Machine) and iSCSI. Ideally, i would like something i can
manage "easily" and something that works with the Dell...
 
Any recommendations? Any comparisons to each?
 
Thanks.
--
Tiernan O'Toole
blog.lotas-smartman.net
www.geekphotographer.com
www.tiernanotoole.ie
Ian Collins
2013-02-26 21:03:32 UTC
Post by Robert Milkowski
Solaris 11.1 (free for non-prod use).
But a ticking bomb if you use a cache device.
--
Ian.
Robert Milkowski
2013-02-26 22:40:41 UTC
Post by Ian Collins
Post by Robert Milkowski
Solaris 11.1 (free for non-prod use).
But a ticking bomb if you use a cache device.
It's been fixed in an SRU (although that is only available to customers
with a support contract - still, the fix will be in 11.2 as well).

Then, I'm sure there are other bugs which are fixed in S11 and not in
Illumos (and vice-versa).
--
Robert Milkowski
http://milek.blogspot.com
Ian Collins
2013-02-26 23:04:15 UTC
Post by Robert Milkowski
Post by Ian Collins
Post by Robert Milkowski
Solaris 11.1 (free for non-prod use).
But a ticking bomb if you use a cache device.
It's been fixed in SRU (although this is only for customers with a support
contract - still, will be in 11.2 as well).
Then, I'm sure there are other bugs which are fixed in S11 and not in
Illumos (and vice-versa).
There may well be, but in seven+ years of using ZFS, this was the first
one to cost me a pool.
--
Ian.