Discussion: user undo
Jeremy Teo
2006-05-24 15:41:06 UTC
Hello,

with reference to bug id #4852821: user undo

I have implemented a basic prototype with the following functionality:

1) deleted files/directories are moved to /your_pool/your_fs/.zfs/deleted
Unfortunately, it is non-trivial to completely reproduce the namespace
of deleted files: for now, deleting "/foo/bar" will result in
".zfs/deleted/bar".
2) As a result of 1, deleted files move out of .zfs/deleted in FIFO order.
I.e., if you remove /foo/bar twice, the most recent copy will be the one
remaining in .zfs/deleted.
3) If another user deletes /foo/bar, and you try to delete /foo/bar,
you will be denied permission. Again, this is due to namespace
clashes.

I'm leaning towards completely reproducing the namespace, but would
like to get a feel for whether the benefits outweigh the code
complexity. Advice would be appreciated.

Also, I presume I can request-sponsor for 4852821 and get someone from
the zfs team to mentor me?

Thanks again for all your time. :)
--
Regards,
Jeremy
Darren J Moffat
2006-05-24 15:58:16 UTC
Post by Jeremy Teo
Hello,
with reference to bug id #4852821: user undo
1) deleted files/directories are moved to /your_pool/your_fs/.zfs/deleted
Unfortunately, it is non-trivial to completely reproduce the namespace
of deleted files: for now, deleting "/foo/bar" will result in
".zfs/deleted/bar".
I had exactly the same issue when I implemented this for ext2 years ago
:-) [ I never finished it - got bored with it since I didn't actually
need it myself :-) ].
Post by Jeremy Teo
2) As a result of 1, deleted files move out of .zfs/deleted in FIFO order.
I.e., if you remove /foo/bar twice, the most recent copy will be the one
remaining in .zfs/deleted.
That seems okay, and you have that issue even if you do reproduce the
namespace, for example:

$ rm foo/bar
< you now have .zfs/deleted/foo/bar >
$ mkdir foo/bar
< you now have .zfs/deleted/foo/bar and foo/bar >
$ rm foo/bar

Now what do we do? The FIFO seems reasonable to me; if you need better
than that, use snapshots.
Post by Jeremy Teo
3) If another user deletes /foo/bar, and you try to delete /foo/bar,
you will be denied permission. Again, this is due to namespace
clashes.
That's not good, and I think this would violate POSIX requirements for
unlink(2).
Post by Jeremy Teo
I'm leaning towards completely reproducing the namespace, but would
like to get a feel for whether the benefits outweigh the code
complexity. Advice would be appreciated.
POSIX compliance is a must. The FIFO idea actually sounds pretty good.
How does this interact with snapshots?
Post by Jeremy Teo
Also, I presume I can request-sponsor for 4852821 and get someone from
the zfs team to mentor me?
The request-sponsor is really for when you are done. If you want code
help, I've found asking on zfs-***@opensolaris.org gets great feedback.
--
Darren J Moffat
James Dickens
2006-05-24 16:22:13 UTC
Post by Jeremy Teo
Hello,
with reference to bug id #4852821: user undo
1) deleted files/directories are moved to /your_pool/your_fs/.zfs/deleted
Unfortunately, it is non-trivial to completely reproduce the namespace
of deleted files: for now, deleting "/foo/bar" will result in
".zfs/deleted/bar".
2) As a result of 1, deleted files move out of .zfs/deleted in FIFO order.
I.e., if you remove /foo/bar twice, the most recent copy will be the one
remaining in .zfs/deleted.
3) If another user deletes /foo/bar, and you try to delete /foo/bar,
you will be denied permission. Again, this is due to namespace
clashes.
How about changing the name of the file to uid-filename or
username-filename? This at least gives each user the ability to
delete their own file, and shouldn't be much work. Another possible
enhancement would be allowing any field from stat(2) in the file's
name after it is deleted. This would be set per filesystem: mode, uid,
username (the code should do the conversion), gid, size, mtime; just
parse a format string like $mtime-$name.
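
A rough sketch of that expansion in C (the helper name and the
hard-coded $uid-$name format are made up; a real implementation would
parse the per-filesystem format string):

#include <stdio.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: build the name used inside .zfs/deleted from
 * the file's stat data.  Hard-coded here to "$uid-$name".
 */
static void
deleted_name(const struct stat *st, const char *name, char *buf, size_t len)
{
        (void) snprintf(buf, len, "%ld-%s", (long)st->st_uid, name);
}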

James Dickens
uadmin.blogspot.com
Post by Jeremy Teo
I'm leaning towards completely reproducing the namespace, but would
like to get a feel for whether the benefits outweigh the code
complexity. Advice would be appreciated.
Also, I presume I can request-sponsor for 4852821 and get someone from
the zfs team to mentor me?
Thanks again for all your time. :)
--
Regards,
Jeremy
Bill Sommerfeld
2006-05-24 17:33:13 UTC
Post by James Dickens
How about changing the name of the file to uid-filename or
username-filename? This at least gives each user the ability to
delete their own file, and shouldn't be much work. Another possible
enhancement would be allowing any field from stat(2) in the file's
name after it is deleted. This would be set per filesystem: mode, uid,
username (the code should do the conversion), gid, size, mtime; just
parse a format string like $mtime-$name.
A number of (generally older) systems have had the concept of numbered
file versions. (I recall seeing this during casual use of ITS, TOPS-20,
and VMS;
GNU Emacs and its derivatives emulate this via the use of .~NN~ backup
copies, but this "pollutes" the directory namespace).

Adding a version/generation number to the filename in the "deleted"
directory would allow multiple versions to coexist.

It might also make sense to populate the "deleted" directory with an
older version when file contents are deleted via an

open(..., ...|O_TRUNC)

or when a file is deleted via rename.
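
As a rough user-level sketch of that (the .zfs/deleted layout and the
use of link(2)/unlink(2) are assumptions; inside ZFS this would happen
below the VOP layer):

#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Move 'path' into .zfs/deleted as "name;<gen>", bumping <gen> until a
 * free slot is found, so multiple deleted versions can coexist.
 * link(2) fails with EEXIST on a collision, unlike rename(2), which
 * would silently replace the older version.
 */
int
move_to_deleted(const char *path, const char *name)
{
        char target[1024];
        int gen;

        for (gen = 1; gen < 10000; gen++) {
                (void) snprintf(target, sizeof (target),
                    ".zfs/deleted/%s;%d", name, gen);
                if (link(path, target) == 0)
                        return (unlink(path));
                if (errno != EEXIST)
                        return (-1);
        }
        return (-1);
}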

- Bill
Erik Trimble
2006-05-24 18:22:23 UTC
Ummmm.

Remind me why we should support "undo" (or, more aptly named, "safe
delete") in ZFS?

Isn't this an application feature, not a filesystem feature? I would
expect something like this behavior when using Nautilus, but certainly
not when using "rm".

That is, maybe there should be a library which has a "safe delete"
system call for use by applications, and has code specific to the
various filesystems to implement the feature, but I can't really see the
point in implementing "safe delete" at the filesystem level. It screws
with too many long-standing assumptions.



If you want to avoid a namespace collision, you'll probably have to
implement the "recycle bin" as a DB and file collection. Move the file
being deleted over to /your_pool/your_fs/.zfs/deleted/, and rename it by
a unique ID (whatever the ZFS equivalent of inodes is, for example).
Keep the ACL/permissions on the file, but put the complete pathname
(relative to the root of the filesystem) [and, maybe, something like
create_date as well] in a hashtable with the ID as key. For directories,
you might need something a little more fancy in the DB (like keeping the
full ACL/perm metadata there). Thus, you'd end up with something like:

% ls /your_pool/your_fs/.zfs/deleted/
files.db
dirs.db
10021238
01924132
13243542
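
A sketch of what one files.db record might hold, assuming a flat
fixed-size record format for the prototype (the real DB layout is an
open design question):

#include <stdint.h>
#include <stdio.h>
#include <time.h>
#include <sys/types.h>

/* One entry per deleted file, keyed by its unique object ID. */
struct deleted_rec {
        uint64_t id;            /* object ID, also the on-disk file name */
        char     path[1024];    /* original pathname, relative to fs root */
        time_t   deleted_at;    /* when the deletion happened */
        mode_t   mode;          /* original permission bits */
};

/* Append a record; lookups would scan (or index) on 'id'. */
static int
db_append(FILE *db, const struct deleted_rec *rec)
{
        return (fwrite(rec, sizeof (*rec), 1, db) == 1 ? 0 : -1);
}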




And, once again, I've totally forgotten where the ZFS bug list lives.
Pointers, please?
--
Erik Trimble
Java System Support
Mailstop: usca14-102
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
Eric Schrock
2006-05-24 18:31:40 UTC
Post by Erik Trimble
Ummmm.
Remind me why we should support "undo" (or, more aptly named, "safe
delete") in ZFS?
Isn't this an application feature, not a filesystem feature? I would
expect something like this behavior when using Nautilus, but certainly
not when using "rm".
This is exactly why it should be supported. It is
application-independent. If you count the number of Solaris users who
type 'rm' versus the number that click-and-drag files to the trash bin,
I'd wager that you'd find many, many orders of magnitude more folks who
don't _want_ to rely on application features. The recycle bin is also
per-user, not per-filesystem. The location of the copy is dependent on
who did the original deletion, and may not be accessible (e.g. over NFS)
in the same way as the original filesystem.
Post by Erik Trimble
That is, maybe there should be a library which has a "safe delete"
system call for use by applications, and has code specific to the
various filesystems to implement the feature, but I can't really see the
point in implementing "safe delete" at the filesystem level. It screws
with too many long-standing assumptions.
You don't have to use it; it would be a property like anything else
in ZFS, and one which would default to 'off'. It obviously cannot be on
by default because it would violate too many POSIX rules.

- Eric

--
Eric Schrock, Solaris Kernel Development http://blogs.sun.com/eschrock
Erik Trimble
2006-05-24 19:18:48 UTC
Post by Eric Schrock
Post by Erik Trimble
Isn't this an application feature, not a filesystem feature? I would
expect something like this behavior when using Nautilus, but certainly
not when using "rm".
This is exactly why it should be supported. It is
application-independent. If you count the number of Solaris users who
type 'rm' versus the number that click-and-drag files to the trash bin,
I'd wager that you'd find many, many orders of magnitude more folks who
don't _want_ to rely on application features. The recycle bin is also
per-user, not per-filesystem. The location of the copy is dependent on
who did the original deletion, and may not be accessible (e.g. over NFS)
in the same way as the original filesystem.
But my point is that "undo" is appropriate at the APPLICATION level,
not the FILESYSTEM level. An application (whether Nautilus or "rm")
should have the ability to call a system library to support "undo",
which has the relevant code, but ZFS itself should have no concept of
"undo". This keeps the applications FS-agnostic, so you support "undo"
across ZFS, UFS, NFS, etc.

So, our mythical system library (libundelete.so) should support a couple
of generic functions (say, int safe_unlink(const char *path) and void
empty_recyclebin(const char *path), which look for an ENV variable to
determine whether they should recycle or delete, as appropriate) for
applications to call, and then the library code has to support
implementing this on various FSes.
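
A minimal sketch of what safe_unlink() might look like; the
RECYCLE_BIN environment variable and the bin layout are assumptions,
and collision handling is omitted:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical safe_unlink(): if the (made-up) RECYCLE_BIN variable is
 * set, move the file there instead of deleting it.  A real library
 * would also handle name collisions and fall back to a copy when
 * rename(2) fails with EXDEV across filesystem boundaries.
 */
int
safe_unlink(const char *path)
{
        const char *bin = getenv("RECYCLE_BIN");
        const char *base;
        char target[1024];

        if (bin == NULL)                /* no bin configured: plain delete */
                return (unlink(path));

        base = strrchr(path, '/');
        base = (base != NULL) ? base + 1 : path;
        (void) snprintf(target, sizeof (target), "%s/%s", bin, base);
        return (rename(path, target));
}

An "rm" built against the library would then call safe_unlink() where
it calls unlink(2) today.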


Maybe, MAYBE, after we implement a generic system library call to
support "undo" across all (reasonable) FSes, we consider putting
"undo" in the actual FS for performance reasons, so that the library can
simply call the FS libraries to do the "undo".
--
Erik Trimble
Java System Support
Mailstop: usca14-102
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
Eric Schrock
2006-05-24 21:43:38 UTC
Post by Erik Trimble
But my point is that "undo" is appropriate at the APPLICATION level,
not the FILESYSTEM level. An application (whether Nautilus or "rm")
should have the ability to call a system library to support "undo",
which has the relevant code, but ZFS itself should have no concept of
"undo". This keeps the applications FS-agnostic, so you support "undo"
across ZFS, UFS, NFS, etc.
So, our mythical system library (libundelete.so) should support a couple
of generic functions (say, int safe_unlink(const char *path) and void
empty_recyclebin(const char *path), which look for an ENV variable to
determine whether they should recycle or delete, as appropriate) for
applications to call, and then the library code has to support
implementing this on various FSes.
Maybe, MAYBE, after we implement a generic system library call to
support "undo" across all (reasonable) FSes, we consider putting
"undo" in the actual FS for performance reasons, so that the library can
simply call the FS libraries to do the "undo".
No, this is not the point of this RFE. We are not trying to implement a
wide-ranging subsystem that understands how to manage semantically valid
undo points. This would never, ever, be supported by any significant
number of applications, and is probably impossible at the filesystem
level.

The point is rather to provide "undelete", which will account for 99% of
all the times that someone would want to have 'undo'. This is a vastly
simpler problem, and probably more useful. Feel free to think of it as
'undelete' instead of 'undo' if it makes things easier.

- Eric

--
Eric Schrock, Solaris Kernel Development http://blogs.sun.com/eschrock
Nicolas Williams
2006-05-24 22:21:34 UTC
Post by Eric Schrock
No, this is not the point of this RFE. We are not trying to implement a
wide-ranging subsystem that understands how to manage semantically valid
undo points. This would never, ever, be supported by any significant
number of applications, and is probably impossible at the filesystem
level.
The point is rather to provide "undelete", which will account for 99% of
all the times that someone would want to have 'undo'. This is a vastly
simpler problem, and probably more useful. Feel free to think of it as
'undelete' instead of 'undo' if it makes things easier.
While we can probably make some 'versions' of files, deleted or
otherwise, naturally show up in .zfs/.deleted/.version/.something
directories, I wonder if we might not want an API that could let one get
at all 'versions' (one version per-txg) of a file still available --
i.e., going backwards in the transaction group history, if all old
blocks for a given 'version' of a file are still not reclaimed, you can
then re-create the file.

ACLs get interesting.

For deleted files that show up in <root>/.zfs/deleted we have the
problem that directory permissions in the path(s) to the file are lost,
so the deleted file name really needs to be something not meaningful to
humans (say, dnode/gen numbers), and any indexing needs to be per-file
owner.

For file versions available through some API, which ACL should be
checked? The current one, or the old one(s), or both/all?

An API might evolve to allow for per-file snapshots/clones.

Nico
--
Erik Trimble
2006-05-24 22:26:07 UTC
Post by Eric Schrock
No, this is not the point of this RFE. We are not trying to implement a
wide-ranging subsystem that understands how to manage semantically valid
undo points. This would never, ever, be supported by any significant
number of applications, and is probably impossible at the filesystem
level.
The point is rather to provide "undelete", which will account for 99% of
all the times that someone would want to have 'undo'. This is a vastly
simpler problem, and probably more useful. Feel free to think of it as
'undelete' instead of 'undo' if it makes things easier.
- Eric
Sorry, semantics on my part. I mean "undelete", in a manner identical to
having the Recycling Bin functionality of Nautilus or Windows Explorer.
That is, when you "delete" a file, it is actually moved aside to some
hidden place, where it can be recovered easily by another command.

All my arguments are concerning this kind of functionality, which I'm
trying to say belongs up in the app. Otherwise, it gets _very_
confusing.


Let's say that you implement "undelete" in ZFS, which, in order to work,
has to (a) be an enabled attribute of the ZFS pool or filesystem, and
(b) use some sort of an ENV var to indicate that a given user's tools
will do "undelete" instead of permanent remove.

Now, you end up with a situation where behavior of an app varies
significantly across filesystem boundaries, which are _supposed_ to be
invisible to the end-user. That is, the behavior of "rm" varies
according to where in the filesystem tree I sit. Additionally, it
doesn't allow for variation; that is, deleting a file via "rm" and
"nautilus" does the exact same thing, even if I wanted "rm" to actually
remove the file and not just send it to the recycle bin.


Rather, I would submit that for better consistency, having a new global
libundelete.so (containing a modified "unlink") which implements
"undelete" in a FS-agnostic way is better. You get the feature across
all Filesystem Types that way, and it's portable. It would also allow
apps to decide if they want to support "undelete" or vanilla "unlink" on
an app-by-app basis. The apps would have to link against the new
libundelete.so to get the functionality, which I think is reasonable.
--
Erik Trimble
Java System Support
Mailstop: usca14-102
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
Nathan Kroenert
2006-05-24 23:36:32 UTC
Cool -

I can see my old fav's from Netware 3.12 making a comeback.

It was always great to be able to salvage things from a disk that
someone did not mean to kill. :)

ah - salvage - my old friend...

Does this also usher in the return of purge too? :)

Nathan.
Post by Erik Trimble
Post by Eric Schrock
No, this is not the point of this RFE. We are not trying to implement a
wide-ranging subsystem that understands how to manage semantically valid
undo points. This would never, ever, be supported by any significant
number of applications, and is probably impossible at the filesystem
level.
The point is rather to provide "undelete", which will account for 99% of
all the times that someone would want to have 'undo'. This is a vastly
simpler problem, and probably more useful. Feel free to think of it as
'undelete' instead of 'undo' if it makes things easier.
- Eric
Sorry, semantics on my part. I mean "undelete", in a manner identical to
having the Recycling Bin functionality of Nautilus or Windows Explorer.
That is, when you "delete" a file, it is actually moved aside to some
hidden place, where it can be recovered easily by another command.
All my arguments are concerning this kind of functionality, which I'm
trying to say belongs up in the app. Otherwise, it gets _very_
confusing.
Let's say that you implement "undelete" in ZFS, which, in order to work,
has to (a) be an enabled attribute of the ZFS pool or filesystem, and
(b) use some sort of an ENV var to indicate that a given user's tools
will do "undelete" instead of permanent remove.
Now, you end up with a situation where behavior of an app varies
significantly across filesystem boundaries, which are _supposed_ to be
invisible to the end-user. That is, the behavior of "rm" varies
according to where in the filesystem tree I sit. Additionally, it
doesn't allow for variation; that is, deleting a file via "rm" and
"nautilus" does the exact same thing, even if I wanted "rm" to actually
remove the file and not just send it to the recycle bin.
Rather, I would submit that for better consistency, having a new global
libundelete.so (containing a modified "unlink") which implements
"undelete" in a FS-agnostic way is better. You get the feature across
all Filesystem Types that way, and it's portable. It would also allow
apps to decide if they want to support "undelete" or vanilla "unlink" on
an app-by-app basis. The apps would have to link against the new
libundelete.so to get the functionality, which I think is reasonable.
Mike Gerdts
2006-05-25 00:10:52 UTC
Post by Erik Trimble
So, our mythical system library (libundelete.so) should support a couple
of generic functions (say, int safe_unlink(const char *path) and void
empty_recyclebin(const char *path), which look for an ENV variable to
determine whether they should recycle or delete, as appropriate) for
applications to call, and then the library code has to support
implementing this on various FSes.
If it were unlink(3C) rather than unlink(2), an interposer library
could make this functionality generally available. Surely there must
be a dtrace hack that could redirect calls destined for unlink() to
safe_unlink(), subject to environment information.
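
For unlink() specifically, the runtime linker can already do the
redirection - a sketch of an LD_PRELOAD interposer, reusing the
hypothetical RECYCLE_BIN variable and move-to-bin policy from earlier
in the thread:

#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Interposed unlink(): compiled into a shared object and activated
 * with LD_PRELOAD, it diverts deletes into $RECYCLE_BIN when that
 * variable is set, and otherwise falls through to the real unlink(2).
 */
int
unlink(const char *path)
{
        int (*real_unlink)(const char *) =
            (int (*)(const char *))dlsym(RTLD_NEXT, "unlink");
        const char *bin = getenv("RECYCLE_BIN");
        const char *base;
        char target[1024];

        if (bin == NULL)
                return (real_unlink(path));

        base = strrchr(path, '/');
        base = (base != NULL) ? base + 1 : path;
        (void) snprintf(target, sizeof (target), "%s/%s", bin, base);
        return (rename(path, target)); /* EXDEV needs a copy in real life */
}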

I suspect, however, that protecting every file from deletion may be a
bit aggressive. Consider "here documents" from shell scripts, browser
cache files, compiler temp files, etc.

Mike
--
Mike Gerdts
http://mgerdts.blogspot.com/
Nicolas Williams
2006-05-25 03:35:12 UTC
Post by Mike Gerdts
If it were unlink(3C) rather than unlink(2), an interposer library
could make this functionality generally available. Surely there must
be a dtrace hack that could redirect calls destined for unlink() to
safe_unlink(), subject to environment information.
You most certainly can interpose on system calls just as with any C
function calls -- after all, applications have to call function stubs for
them that in turn do the actual trapping to the kernel.

Nico
--
Nicolas Williams
2006-05-24 17:56:35 UTC
Other possibilities:

- put a .deleted directory in every directory (not on by default, for
POSIX compliance)

- put a link in .deleted named after the file's dnode and append a text
({fname, dnode#}) entry to a log file so it can more easily be found

Ultimately deleted files' space has to be reclaimed though, so something
has to delete .deleted files, no?
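
A user-level sketch of the second idea, with st_ino standing in for
the dnode number and the .deleted layout assumed:

#include <stdio.h>
#include <sys/stat.h>

/*
 * Move dir/fname into dir/.deleted under its object number, and append
 * a "{fname, dnode#}" line to a log so the file can be found by name.
 */
int
delete_into_dotdeleted(const char *dir, const char *fname)
{
        char path[1024], target[1024], logpath[1024];
        struct stat st;
        FILE *log;

        (void) snprintf(path, sizeof (path), "%s/%s", dir, fname);
        if (stat(path, &st) != 0)
                return (-1);

        (void) snprintf(target, sizeof (target), "%s/.deleted/%llu",
            dir, (unsigned long long)st.st_ino);
        if (rename(path, target) != 0)
                return (-1);

        (void) snprintf(logpath, sizeof (logpath), "%s/.deleted/log", dir);
        if ((log = fopen(logpath, "a")) == NULL)
                return (-1);
        (void) fprintf(log, "%s %llu\n",
            fname, (unsigned long long)st.st_ino);
        return (fclose(log));
}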

Nico
--
Joerg Schilling
2006-05-25 12:03:35 UTC
Post by Jeremy Teo
Hello,
with reference to bug id #4852821: user undo
1) deleted files/directories are moved to /your_pool/your_fs/.zfs/deleted
Unfortunately, it is non-trivial to completely reproduce the namespace
of deleted files: for now, deleting "/foo/bar" will result in
".zfs/deleted/bar".
2) As a result of 1, deleted files move out of .zfs/deleted in FIFO order.
I.e., if you remove /foo/bar twice, the most recent copy will be the one
remaining in .zfs/deleted.
3) If another user deletes /foo/bar, and you try to delete /foo/bar,
you will be denied permission. Again, this is due to namespace
clashes.
How about appending the decimal inode number to the file name?

Jörg
--
EMail:***@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
***@cs.tu-berlin.de (uni)
***@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
URL: http://cdrecord.berlios.de/old/private/ ftp://ftp.berlios.de/pub/schily
Mark Shellenbaum
2006-05-25 14:11:15 UTC
Post by Joerg Schilling
Post by Jeremy Teo
Hello,
with reference to bug id #4852821: user undo
1) deleted files/directories are moved to /your_pool/your_fs/.zfs/deleted
Unfortunately, it is non-trivial to completely reproduce the namespace
of deleted files: for now, deleting "/foo/bar" will result in
".zfs/deleted/bar".
2) As a result of 1, deleted files move out of .zfs/deleted in FIFO order.
I.e., if you remove /foo/bar twice, the most recent copy will be the one
remaining in .zfs/deleted.
3) If another user deletes /foo/bar, and you try to delete /foo/bar,
you will be denied permission. Again, this is due to namespace
clashes.
How about appending the decimal inode number to the file name?
Anything that attempts to append characters to the end of the filename
will run into trouble when the file name is already at NAME_MAX.

-Mark
Anton B. Rang
2006-05-26 15:23:05 UTC
Post by Mark Shellenbaum
Anything that attempts to append characters to the end of the filename
will run into trouble when the file name is already at NAME_MAX.
One simple solution is to restrict the total length of the name to NAME_MAX, truncating the original filename as necessary to allow appending. This does introduce the possibility of conflicts with very long names which happen to end in numeric strings, but that is likely to be rare and could be resolved in an ad hoc fashion (e.g. flipping a bit in the representation of "inode number" until a unique name is achieved).
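
A sketch of that truncation (assuming the numeric suffix itself always
fits, and leaving out the collision handling described above):

#include <limits.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "<name>.<ino>" without exceeding NAME_MAX, truncating the
 * original name as needed to make room for the suffix.
 */
static void
deleted_name(const char *name, unsigned long long ino,
    char *buf, size_t len)
{
        char suffix[32];
        size_t keep;

        (void) snprintf(suffix, sizeof (suffix), ".%llu", ino);
        keep = NAME_MAX - strlen(suffix);
        if (strlen(name) < keep)
                keep = strlen(name);
        (void) snprintf(buf, len, "%.*s%s", (int)keep, name, suffix);
}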


Constantin Gonzalez
2006-05-29 08:50:44 UTC
Hi,

the current discussion on how to implement "undo" seems to revolve around
concepts and tweaks for replacing any "rm"-like action with "mv" and then
fixing the problems associated with namespaces, ACLs etc.

Why not use snapshots?

A snapshot-oriented implementation of undo would:

- Create a snapshot of the FS whenever anything is attempted that someone
might want to undo. This could be done even at the most fundamental level
(i.e. before any "zpool" or "zfs" command, where the potential damage to
be undone is biggest).

- The undo-feature would then exchange the live FS with the snapshot taken
prior to the revoked action. Just tweak one or two pointers and the undo
is done.

- This would transparently work with any app, user action, even admin action,
depending on where the snapshotting code would be hooked up to.

- As an alternative to undo, the user can browse the .zfs hierarchy in search
of that small file which got lost in an rm -rf orgy without having to restore
the snapshot with all the other unwanted files.

- When ZFS wants to reclaim blocks, it would start deleting the oldest
undo-snapshots.

- To separate undo-snapshots from user-triggered ones, the undo-code could
place its snapshots in .zfs/snapshots/undo .

Did I miss something why undo can't be implemented with snapshots?

Best regards,
Constantin
--
Constantin Gonzalez Sun Microsystems GmbH, Germany
Platform Technology Group, Client Solutions http://www.sun.de/
Tel.: +49 89/4 60 08-25 91 http://blogs.sun.com/constantin/
Jeremy Teo
2006-05-29 10:12:41 UTC
Hello Constantin,
Post by Constantin Gonzalez
Hi,
the current discussion on how to implement "undo" seems to revolve around
concepts and tweaks for replacing any "rm"-like action with "mv" and then
fixing the problems associated with namespaces, ACLs etc.
Why not use snapshots?
I hadn't considered that yet: I was quite fixated on solving this at
the ZPL level. :(
Post by Constantin Gonzalez
- Create a snapshot of the FS whenever anything is attempted that someone
might want to undo. This could be done even at the most fundamental level
(i.e. before any "zpool" or "zfs" command, where the potential damage to
be undone is biggest).
- The undo-feature would then exchange the live FS with the snapshot taken
prior to the revoked action. Just tweak one or two pointers and the undo
is done.
- This would transparently work with any app, user action, even admin action,
depending on where the snapshotting code would be hooked up to.
- As an alternative to undo, the user can browse the .zfs hierarchy in search
of that small file which got lost in an rm -rf orgy without having to restore
the snapshot with all the other unwanted files.
- When ZFS wants to reclaim blocks, it would start deleting the oldest
undo-snapshots.
- To separate undo-snapshots from user-triggered ones, the undo-code could
place its snapshots in .zfs/snapshots/undo .
Did I miss something why undo can't be implemented with snapshots?
No, you didn't. Given the points you have raised, this does seem like
the way to go.

I'll dig around in the snapshot code and see if I can whip up
something more lightweight, i.e. a more targeted snapshot that can
snapshot a subset of the original filesystem.

Thanks for the suggestion! :)
--
Regards,
Jeremy
Neil Perrin
2006-05-30 02:52:03 UTC
Post by Constantin Gonzalez
Hi,
the current discussion on how to implement "undo" seems to revolve around
concepts and tweaks for replacing any "rm"-like action with "mv" and then
fixing the problems associated with namespaces, ACLs etc.
Why not use snapshots?
- Create a snapshot of the FS whenever anything is attempted that someone
might want to undo. This could be done even at the most fundamental level
(i.e. before any "zpool" or "zfs" command, where the potential damage to
be undone is biggest).
- The undo-feature would then exchange the live FS with the snapshot taken
prior to the revoked action. Just tweak one or two pointers and the undo
is done.
- This would transparently work with any app, user action, even admin action,
depending on where the snapshotting code would be hooked up to.
- As an alternative to undo, the user can browse the .zfs hierarchy in search
of that small file which got lost in an rm -rf orgy without having to restore
the snapshot with all the other unwanted files.
- When ZFS wants to reclaim blocks, it would start deleting the oldest
undo-snapshots.
- To separate undo-snapshots from user-triggered ones, the undo-code could
place its snapshots in .zfs/snapshots/undo .
Did I miss something why undo can't be implemented with snapshots?
Well, creating a snapshot isn't exactly free. It requires flushing out
all the current in-progress transactions to ensure the containing
transaction group is committed on disk. This can be quick or can take
a few seconds depending on the current load. So it isn't practical to
snapshot before every remove - but perhaps a coarser grain might work?
Post by Constantin Gonzalez
Best regards,
Constantin
--
Neil
Erik Trimble
2006-05-30 06:00:29 UTC
Once again, I hate to be a harpy on this one, but are we really
convinced that having an "undo" (I'm going to call it RecycleBin from now
on) function for file deletion built into ZFS is a good thing?

Since I've seen nothing to the contrary, I'm assuming that we're doing
this by changing the actual effects of an "unlink(2)" sys lib call
against a file in ZFS, and having some other library call added to take
care of actual deletion.

Even with it being a ZFS option parameter, I can see soooo many places
that it breaks assumptions and causes problems that I can't think it's a
good thing to blindly turn on for everything.

And, I've still not seen a good rebuttal to the idea of moving this up
to the Application level, and using a new library to implement the
functionality (which requires apps to specifically (and explicitly)
support RecycleBin in the design).



You will notice that Windows does this. The Recycle Bin is usable from
within Windows Explorer, but if you use "del" from a command prompt, it
actually deletes the file. I see no reason why we shouldn't support the
same functionality (i.e. RecycleBin from within Nautilus (as it already
does), and true deletion via "rm").



-Erik
Constantin Gonzalez Schmitz
2006-05-30 14:50:01 UTC
Hi,

so we have two questions:

1. Is it really ZFS' job to provide an undo functionality?

2. If it turns out to be a feature that needs to be implemented by
ZFS, what is the better approach: Snapshot based or file-based?

My personal opinion on 1) is:

- The purpose of any Undo-like action is to provide a safety net to the user
in case she commits an error that she wants to undo.

- So, it depends on how we define "user" here. If by user we mean your regular
file system user with a GUI etc., then of course it's a matter of the
application.

- But if user=sysadmin, I guess a more fundamental way of implementing "undo" is
in order. We could restrict the undo functionality to some admin
interface and force admins to use just that, but then it would still be a feature
that the admin interface needs to implement.

But in order to save all admins from shooting themselves into their knees, the
best way would be to provide an admin-savvy safety net.

- Now, coming from the other side, ZFS provides a nice and elegant way of
implementing snapshots. That's where I count 1+1: If ZFS knew how to do
snapshots right before any significant administrator or user action and if
ZFS had a way of managing those snapshots so admins and users could easily
undo any action (including zfs destroy, zpool destroy, or just rm -rf /*),
then the benefit/investment ratio for implementing such a feature should
be extremely interesting.

One more step towards a truly foolproof filesystem.

But: If it turns out that providing an undo function via snapshots is not
possible/elegantly feasible/cheap or if there's any significant roadblock that
prevents ZFS from providing an undo feature in an elegant way, then it might not
be a good idea after all and we should just forget it.

So I guess it boils down to: Can the ZFS framework be used to implement an undo
feature much more elegantly than your classic file manager, while extending the
range of undo customers to even the CLI-based admin?

Best regards,
Constantin
Post by Erik Trimble
Once again, I hate to be a harpy on this one, but are we really
convinced that having an "undo" (I'm going to call it RecycleBin from now
on) function for file deletion built into ZFS is a good thing?
Since I've seen nothing to the contrary, I'm assuming that we're doing
this by changing the actual effects of an "unlink(2)" sys lib call
against a file in ZFS, and having some other library call added to take
care of actual deletion.
Even with it being a ZFS option parameter, I can see soooo many places
that it breaks assumptions and causes problems that I can't think it's a
good thing to blindly turn on for everything.
And, I've still not seen a good rebuttal to the idea of moving this up
to the Application level, and using a new library to implement the
functionality (which requires apps to specifically (and explicitly)
support RecycleBin in the design).
You will notice that Windows does this. The Recycle Bin is usable from
within Windows Explorer, but if you use "del" from a command prompt, it
actually deletes the file. I see no reason why we shouldn't support the
same functionality (i.e. RecycleBin from within Nautilus (as it already
does), and true deletion via "rm").
-Erik
--
Constantin Gonzalez Sun Microsystems GmbH, Germany
Platform Technology Group, Client Solutions http://www.sun.de/
Tel.: +49 89/4 60 08-25 91 http://blogs.sun.com/constantin/
Tim Foster
2006-05-30 15:57:54 UTC
hey All,
Post by Constantin Gonzalez Schmitz
- The purpose of any Undo-like action is to provide a safety net to the user
in case she commits an error that she wants to undo.
So, what if the user was able to specify which applications they wanted
such a safety net for (thus lessening the load on the filesystem from
watching *every* delete) - or were able to specify a few sub-directories
they wanted to take special care with?

[ eg. "ZFS, please provide me undo capability for files
in /home/timf/Documents/plans-to-takeover-world when I'm using
nautilus" ]

With a tiny bit of DTrace hackery, you could have something like:

------ snapshot-on-delete.d --------
#!/usr/sbin/dtrace -qws

/* Fire whenever the watched process ($1) enters unlink(2). */
syscall::unlink:entry
/pid == $1/
{
	/* Copy the pathname argument in from user space. */
	this->file = copyinstr(arg0);
	/* Destructive action (hence -w): hand the path to a helper script. */
	system("/usr/sbin/take-undo-snapshot.sh %s", this->file);
}
------------------------------------

Something like:

% ./snapshot-on-delete.d `pgrep nautilus`

where the shell script "take-undo-snapshot.sh" would take another
snapshot in some known namespace, up to some pre-defined limit, if
that file was found to be resident on a ZFS filesystem (and optionally
in some given directory).


Now, it probably will scale badly if you have hundreds of users running
hundreds of applications, each one invoking a shell script on each file
delete, and as Neil pointed out, many, many snapshots aren't cheap. But
as a proof of concept, this would work fine.

It'd be interesting to see how badly people wanted this functionality,
before boiling the ocean (again!) to provide it :-)


Of course, "redo" is a little trickier, as your application would need
to know about the snapshot namespace, but at least your data is safe.

cheers,
tim
Post by Constantin Gonzalez Schmitz
- So, it depends on how we define "user" here. If by user we mean your regular
file system user with a GUI etc., then of course it's a matter of the
application.
- But if user=sysadmin, I guess a more fundamental way of implementing "undo" is
in order. We could restrict the undo functionality to some admin
interface and force admins to use just that, but then it would still be a feature
that the admin interface needs to implement.
But in order to save all admins from shooting themselves into their knees, the
best way would be to provide an admin-savvy safety net.
- Now, coming from the other side, ZFS provides a nice and elegant way of
implementing snapshots. That's where I count 1+1: If ZFS knew how to do
snapshots right before any significant administrator or user action and if
ZFS had a way of managing those snapshots so admins and users could easily
undo any action (including zfs destroy, zpool destroy, or just rm -rf /*),
then the benefit/investment ratio for implementing such a feature should
be extremely interesting.
One more step towards a truly foolproof filesystem.
But: If it turns out that providing an undo function via snapshots is not
possible/elegantly feasible/cheap or if there's any significant roadblock that
prevents ZFS from providing an undo feature in an elegant way, then it might not
be a good idea after all and we should just forget it.
So I guess it boils down to: Can the ZFS framework be used to implement an undo
feature much more elegantly than your classic file manager, while extending the
range of undo customers to even the CLI-based admin?
Best regards,
Constantin
Post by Erik Trimble
Once again, I hate to be a harpy on this one, but are we really
convinced that having an "undo" (I'm going to call it RecycleBin from now
on) function for file deletion built into ZFS is a good thing?
Since I've seen nothing to the contrary, I'm assuming that we're doing
this by changing the actual effects of an "unlink(2)" sys lib call
against a file in ZFS, and having some other library call added to take
care of actual deletion.
Even with it being a ZFS option parameter, I can see soooo many places
that it breaks assumptions and causes problems that I can't think it's a
good thing to blindly turn on for everything.
And, I've still not seen a good rebuttal to the idea of moving this up
to the Application level, and using a new library to implement the
functionality (which requires apps to specifically (and explicitly)
support RecycleBin in the design).
You will notice that Windows does this. The Recycle Bin is usable from
within Windows Explorer, but if you use "del" from a command prompt, it
actually deletes the file. I see no reason why we shouldn't support the
same functionality (i.e. RecycleBin from within Nautilus (as it already
does), and true deletion via "rm").
-Erik
--
Tim Foster, Sun Microsystems Inc, Operating Platforms Group
Engineering Operations http://blogs.sun.com/timf
Erik Trimble
2006-05-30 17:48:12 UTC
(I'm going to combine Constantin & Eric's replies together, so I apologize in advance.)
Post by Constantin Gonzalez Schmitz
Hi,
1. Is it really ZFS' job to provide an undo functionality?
2. If it turns out to be a feature that needs to be implemented by
ZFS, what is the better approach: Snapshot based or file-based?
- The purpose of any Undo-like action is to provide a safety net to the user
in case she commits an error that she wants to undo.
- So, it depends on how we define "user" here. If by user we mean your regular
file system user with a GUI etc., then of course it's a matter of the
application.
Agreed. :-)
Post by Constantin Gonzalez Schmitz
- But if user=sysadmin, I guess a more fundamental way of implementing "undo" is
in order. We could restrict the undo functionality to some admin
interface and force admins to use just that, but then it would still be a feature
that the admin interface needs to implement.
But in order to save all admins from shooting themselves into their knees, the
best way would be to provide an admin-savvy safety net.
As an admin, that certainly sounds noble. However, given Eric's description:
Post by Eric Schrock
No. The idea is a FIFO queue, bounded in space. There is no explicit
'actual deletion'. Things just pass in one way and out the other once
the space is needed. If you accidentally delete something, you can
quickly go back and get it, but it's not a replacement for regular
snapshots. For example, you might do:
# zfs set undelete=1m home/eschrock
I can't imagine that any admin would set this FIFO space to anything more
than a very small amount. In my experience, there will always be pressure from
the user base to have more usable disk space (without actually buying
more disk), so the best case I can picture is that the FIFO is under 1GB for
a TB filesystem. Now, for critical filesystems (such as root), which
have a relatively fixed size, I could see setting it to, say, 100MB for a
10GB root partition.

The problem here is that it's _very_ easy to blow right through the FIFO
size limit with just a single "rm -rf". Or, if you are on a filesystem
that has multiple users, the likelihood that several of us combine to
exceed the limit (thus making it likely that what you wanted to
"restore" is gone for good) is much higher. This limits the usefulness
of the feature considerably.
Post by Constantin Gonzalez Schmitz
- Now, coming from the other side, ZFS provides a nice and elegant way of
implementing snapshots. That's where I count 1+1: If ZFS knew how to do
snapshots right before any significant administrator or user action and if
ZFS had a way of managing those snapshots so admins and users could easily
undo any action (including zfs destroy, zpool destroy, or just rm -rf /*),
then the benefit/investment ratio for implementing such a feature should
be extremely interesting.
One more step towards a truly foolproof filesystem.
The problem is that you are attempting to divine user INTENTIONS (did I
_really_ want to do that?). That's a losing proposition. You will always
be wrong (at best) a significant minority of the time, which will be
far above people's threshold for tolerance. Take a look at the "-i"
option for rm. Do you know how annoying that is for an admin (and
that virtually no admin ever uses it)? Yet, it provides about the same
level of protection as "undo" would.

Part of being an administrator is learning proper procedures. One of the
biggest problems with Windows is that it provides the ILLUSION that any
JoeBlow can be an administrator. Yet, time and time again Windows has
huge failures linked directly to incompetent (or untrained)
administrators. We don't want to go down this road with Solaris.

Providing tools which give an 80% solution is generally very useful in
the user space, but is VERY frustrating (and, in my opinion,
counterproductive) in the admin space.

Accidental file deletion (or, as Constantin pointed out, other
admin commands such as "zpool destroy") is a problem. HOWEVER, you want
to provide only a 100% solution to the problem. Am I going to like a
solution which "sorta-kinda-might" restore the file, or one which WILL
restore the file?

What I'm trying to say is that a competent admin will STILL have to
maintain version controlled config files, and back things up on a
regular basis. ZFS snapshots are very nice in these cases, as they
provide a PERMANENT picture of things before changes are made. ZFS undo
doesn't alleviate the need to do any of this, but it CAN provide the
ILLUSION that you can skip this.

Short of full version control and complete history retention for all
files in a ZFS filesystem, filesystem-level undo isn't a good idea. The
person asking for the RFE doesn't (in my opinion) have a competent admin
staff, and is asking for a BandAid for a problem requiring skilled
surgeons.

Yes, that is an exaggeration, and yes, there are times when I've
fat-fingered something and said "aaaargh. I want that back right now!".
But, once again, proper sysadmin procedures already protect one from
that. Guaranteed.
Post by Constantin Gonzalez Schmitz
But: If it turns out that providing an undo function via snapshots is not
possible/elegantly feasible/cheap or if there's any significant roadblock that
prevents ZFS from providing an undo feature in an elegant way, then it might not
be a good idea after all and we should just forget it.
So I guess it boils down to: Can the ZFS framework be used to implement an undo
feature much more elegantly than your classic file manager, while extending the
range of undo customers to even the CLI-based admin?
Best regards,
Constantin
Post by Eric Schrock
- doesn't work over NFS/CIFS (recycle bin 'location' may not be
accessible on all hosts, or may require cross-network traffic to
delete a file).
So, how am I supposed to get back a file from your proposed "undo"
solution? Does every directory have a .zfs/deleted directory inside it,
with the "temporarily deleted" files residing in there? If not, how do
I access a ZFS undo directory across NFS? If so, how is this any
different than having a .recyclebin directory for each normal
directory?
Post by Eric Schrock
- inherently user-centric, not filesystem-centric (location of stored
file depends on who's doing the deletion)
With Application-level RecycleBin, the assumption is that all the
clients have a common API to call (some lib installed locally) that
implements the generic RecycleBin technology across all filesystem
types. Take a look at Samba's implementation of RecycleBin - it works
just fine for user-centric applications.
Post by Eric Schrock
- requires application changes (which means it can _NEVER_ scale beyond
a handful of apps)
- ignores the predominant need (accidental 'rm')
Which is EXACTLY what is needed. The RFE is for "accidental rm". There
are only going to be a small handful of applications that really can
make use of an "undo" function. Those apps are going to be the ones
which allow the user to directly manipulate the filesystem.

Put it another way: look at the $HOME/tmp directory. How many
applications create/delete files there on a very regular basis? You
sure you want ZFS to put those into the FIFO "undo"-reserved space?

Or /var/run. Or /tmp. Or any of a dozen different applications which
regularly create and delete files themselves, without any user
intervention. If I enable ZFS undo on the / filesystem to protect
myself, think of all the apps which do this kind of create/delete.
So, how long will your FIFO last again?
--
Erik Trimble
Java System Support
Mailstop: usca14-102
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
Nathan Kroenert
2006-05-31 00:14:58 UTC
Post by Erik Trimble
(I'm going to combine Constantin & Eric's replies together, so I apologize in advance.)
Apology accepted. :)

Anyhoo - What do you think the chances are that any application vendor
is going to write in special handling for Solaris file removal? I'm
guessing slim to none, but have been wrong before...

On the other hand, who hasn't removed something and thought to
themselves "dang. I wish I could get that back right now..."

As a frequent ex-user of the good old Netware Salvage, I can tell you
I'm a real fan of that type of functionality.

Delete something, but it's not actually *really* deleted until we need
that space for something else. At any point, you can fire up the salvage
utility and grab the files back out of that directory.

It was not perfect either, but it was wayyyy faster than having to get
tapes out...

In the case of taking a snapshot before some mythical event that looks
like it's going to seriously change the system, where do we draw the
line? What do we do when we get something like a

for i in *
do
rm $i
done

You want a 100% solution, but is your 100% solution my, or anyone else's,
100% solution?

I for one, would *much* prefer the filesystem to make that stuff
available to me, so regardless of what removed the file, I at least have
a chance to get it back.

Then again, I'd also much prefer that the files be recoverable right up
until we need the space back. Something like what Eric had suggested,
but set the space that the deleted files *cannot* use, so we still
always have 'free' blocks ready for new allocations...

So, as opposed to
# zfs set undelete=1m home/eschrock

something like
# zfs set undelete-queue-size=1m home/eschrock
and
# zfs set undelete-unusable-size=100m home/eschrock

and these two options being mutually exclusive...
From an implementation perspective, I'll be interested to see how we get
things back, particularly in the case of multiple directories being
removed, and NOT wanting to blow away the files in the directories that
I (or my app) might have partially reconstructed in the meanwhile...

<fantasy>

/my/directory/important/rubbish # rm -rf .*

"ARGH!"

cd .. (Not there...)

cd /my/directory

zfs undo $PWD/important (Or whatever interface we use!)

"ah. :)"

</fantasy>

Wow. Even thinking about how the ZFS guys might implement that breaks my
head...

Nathan.
Erik Trimble
2006-05-31 04:38:23 UTC
Post by Nathan Kroenert
Anyhoo - What do you think the chances are that any application vendor
is going to write in special handling for Solaris file removal? I'm
guessing slim to none, but have been wrong before...
Agreed. However, to this I reply: Who Cares? I'm guessing that 99% of
the possible use of the feature is what people keep talking about, which
is accidentally typing "rm" on something. So, if you fix "rm" and
possibly "nautilus" to use the application-layer RecycleBin, then you've
pretty much satisfied everyone - people like me who don't want unlink(2)
hijacked and don't want most apps to use "undo", and people who want
"oops-proof" 'rm' capability. If we have a standard library which
provides application-layer RecycleBin features, then any app-vendor can
do it if they so choose. I suspect that very few will, but the few who
do will have a FILESYSTEM-AGNOSTIC RecycleBin. So we get it for UFS, ZFS,
NFS, Samba, LOFS, and everything else.
Post by Nathan Kroenert
On the other hand, who hasn't removed something and thought to
themselves "dang. I wish I could get that back right now..."
As a frequent ex-user of the good old Netware Salvage, I can tell you
I'm a real fan of that type of functionality.
Delete something, but it's not actually *really* deleted until we need
that space for something else. At any point, you can fire up the salvage
utility and grab the files back out of that directory.
It was not perfect either, but, it was wayyyy faster than having to get
tapes out...
The problem of _easy_ recovery of accidentally deleted files is a SMPSA
(Simple Matter of Proper System Administration). The AFS implementation
at MIT had a nice feature: in everyone's home directory, there was a
.afs/backup directory. It contained what would now be known as a
snapshot of last night's backup. Having that around solved 90% of the
accidental deletion problems, since the main issue usually involves easy
access to the recovery media. This kind of setup is trivial to
configure in the current ZFS.
Post by Nathan Kroenert
In the case of taking a snapshot before some mythical event that looks
like it's going to seriously change the system, where do we draw the
line? What do we do when we get something like a
for i in *
do
rm $i
done
You want a 100% solution, but is your 100% solution my, or anyone elses
100% solution?
The original problem as stated isn't the whole problem domain, so when I
say 100% solution, I mean the solution to the (reasonably restricted)
general case, which in this instance is "I want to be able to recover
any file previously deleted".

Snapshots aren't a solution to the problem. They're useful as a recovery
strategy, but aren't a solution, and if I implied so, then I didn't mean to.
Post by Nathan Kroenert
I for one, would *much* prefer the filesystem to make that stuff
available to me, so regardless of what removed the file, I at least have
a chance to get it back.
Then again, I'd also much prefer that the files be recoverable right up
until we need the space back. Something like what Eric had suggested,
but set the space that the deleted files *cannot* use, so we still
always have 'free' blocks ready for new allocations...
But, once again, you get into the half-solution of
"some-files-are-available-for-some-time". There are clearly far too many
likely scenarios which will blow through any deleted-file repository
UNLESS you require that the deleted files are NEVER deleted until
explicitly done so (e.g. "emptying" the Windows Recycle Bin). And,
there again, we're back to 'oh, did I _really_ mean to do that?' And
people are going to be upset (and complain that the feature doesn't work
right, etc.) if we implement "undo" with some sort of auto-expiration
(either limiting total size, or whatever), mainly because they're not
going to understand the limitations.

And, as I pointed out before, it leads to lazy administrative practices.

Turning on an "undo" for everything at the filesystem level is a
nightmare waiting to happen.


-Erik
Eric Schrock
2006-05-30 16:48:27 UTC
Post by Erik Trimble
Once again, I hate to be a harpy on this one, but are we really
convinced that having an "undo" (I'm going to call it RecycleBin from now
on) function for file deletion built into ZFS is a good thing?
Since I've seen nothing to the contrary, I'm assuming that we're doing
this by changing the actual effects of an "unlink(2)" sys lib call
against a file in ZFS, and having some other library call added to take
care of actual deletion.
No. The idea is a FIFO queue, bounded in space. There is no explicit
'actual deletion'. Things just pass in one way and out the other once
the space is needed. If you accidentally delete something, you can
quickly go back and get it, but it's not a replacement for regular
snapshots. For example, you might do:

# zfs set undelete=1m home/eschrock

Which would keep just 1 MB of deleted files around, which would allow
for recovery of most useful file types (text files, documents, etc.).
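
As a toy model of that bounded FIFO (structures and names are
illustrative only; in the real thing the entries would be on-disk
objects and freeing one would release its blocks):

#include <stdlib.h>

struct dq_entry {
        struct dq_entry *next;
        size_t size;                    /* bytes held by this deleted file */
        /* ... object number, original name, ... */
};

struct dqueue {
        struct dq_entry *head;          /* oldest entry */
        struct dq_entry *tail;          /* newest entry */
        size_t used;
        size_t limit;                   /* e.g. 1 MB from "undelete=1m" */
};

/* Record a newly deleted file, reaping the oldest ones to make room. */
void
dq_add(struct dqueue *q, struct dq_entry *e)
{
        while (q->head != NULL && q->used + e->size > q->limit) {
                struct dq_entry *old = q->head;

                q->head = old->next;
                if (q->head == NULL)
                        q->tail = NULL;
                q->used -= old->size;
                free(old);              /* really: give the blocks back */
        }
        e->next = NULL;
        if (q->tail != NULL)
                q->tail->next = e;
        else
                q->head = e;
        q->tail = e;
        q->used += e->size;
}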
Post by Erik Trimble
Even with it being a ZFS option parameter, I can see soooo many places
that it breaks assumptions and causes problems that I can't think it's a
good thing to blindly turn on for everything.
There is no change in assumption. 'rm' will still remove a file, only
its deletion may be delayed. This is an optional feature, and is
no different than 'rm' in the face of snapshots. If you don't like it,
don't use the feature (just as you probably wouldn't want snapshots).
There is no "really really remove" command - the point is that the user
doesn't have to think about this, it "just works".
Post by Erik Trimble
And, I've still not seen a good rebuttal to the idea of moving this up
to the Application level, and using a new library to implement the
functionality (which requires apps to specifically (and explicitly)
support RecycleBin in the design).
I've tried over and over, with several different points:

- doesn't work over NFS/CIFS (recycle bin 'location' may not be
accessible on all hosts, or may require cross-network traffic to
delete a file).
- inherently user-centric, not filesystem-centric (location of stored
file depends on who's doing the deletion)
- requires application changes (which means it can _NEVER_ scale beyond
a handful of apps)
- ignores the predominant need (accidental 'rm')

These are real requirements, whether or not you think they are good.
Post by Erik Trimble
You will notice that Windows does this. The Recycle Bin is usable from
within Windows Explorer, but if you use "del" from a command prompt, it
actually deletes the file. I see no reason why we shouldn't support the
same functionality (i.e. RecycleBin from within Nautilus (as it already
does), and true deletion via "rm").
Except that the whole point of this RFE is that people _want_ 'rm' to
have undelete functionality. The whole world doesn't use
Nautilus/insert-app-here. I'm all for having some common 'recycle bin'
functionality (though I think no one will use it beyond a handful of
apps), but that is independent of this RFE.

- Eric

--
Eric Schrock, Solaris Kernel Development http://blogs.sun.com/eschrock
Tim Foster
2006-05-30 17:07:20 UTC
Post by Eric Schrock
- doesn't work over NFS/CIFS (recycle bin 'location' may not be
accessible on all hosts, or may require cross-network traffic to
delete a file).
- inherently user-centric, not filesystem-centric (location of stored
file depends on who's doing the deletion)
Aah right, okay - those are reasons against my previous post about
having an application register its interest in getting undelete
capability. Good points, Eric!

cheers,
tim
--
Tim Foster, Sun Microsystems Inc, Operating Platforms Group
Engineering Operations http://blogs.sun.com/timf
can you guess?
2006-06-11 07:21:16 UTC
Interesting thread - a few comments:

Finite-sized validation checksums aren't a 100% solution either, but they're certainly good enough to be extremely useful.

NetApp has built a rather decent business at least in part by providing less-than-100% user-level undo-style facilities via snapshots (not that novel a feature these days, but it was when they introduced it). More recently, 'continuous data protection' products seem to be receiving an enthusiastic response from customers despite their hefty price tags (of course, they *do* purport to be a '100% solution', as long as you're willing to pay for unbounded expansion of storage).

My dim recollection is that TOPS-10 implemented its popular (but again <100%) undelete mechanism using the same kind of 'space-available' approach suggested here. It did, however, support explicit 'delete - I really mean it' facilities to help keep unwanted detritus from shouldering out more desirable bits ('expunge' being the applicable incantation, which had an appropriate ring of finality to it). Tying into user quotas such that one user can't drive another user's most-recently-deleted content out of the system seems implicit in eschrock's comments.

But it is likely that in at least some situations promiscuously retaining *everything* even for a limited time would be a real problem, and that in a lot more it would be at least sub-optimal. Creating a directory attribute inheritable by subdirectories and files controlling temporary undelete-style preservation would help (one could also consider per-file-type controls, though file extensions may not be ideal hooks and I don't know whether ZFS uses file attributes to establish types).
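For concreteness, a purely hypothetical administrative interface for such an
inheritable attribute (the property name below is invented; nothing like it
exists in ZFS today):

$ zfs set preserve:deleted=on tank/home       # keep deleted files for a while
$ zfs set preserve:deleted=off tank/scratch   # opt a scratch area out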

Since this is essentially a per-file mechanism, it really shouldn't require the level of system-wide flush-synchronization that a formal snapshot requires, should it? Especially if it really is limited to preserving deleted files (though it's possible that you could extend it to cover incremental updates as well). If a full-fledged snapshot has too high an overhead to be left to the discretion of common users, that's even more reason to try to implement some form of undelete facility that's lighter in weight.

- bill


This message posted from opensolaris.org
Darren Reed
2006-06-11 20:53:04 UTC
Permalink
Post by can you guess?
But it is likely that in at least some situations promiscuously retaining
*everything* even for a limited time would be a real problem, and that in a
lot more it would be at least sub-optimal. Creating a directory attribute
inheritable by subdirectories and files controlling temporary undelete-style
preservation would help (one could also consider per-file-type controls,
though file extensions may not be ideal hooks and I don't know whether ZFS
uses file attributes to establish types).
Since this is essentially a per-file mechanism, it really shouldn't require
the level of system-wide flush-synchronization that a formal snapshot
requires, should it? Especially if it really is limited to preserving
deleted files (though it's possible that you could extend it to cover
incremental updates as well). If a full-fledged snapshot has too high an
overhead to be left to the discretion of common users, that's even more
reason to try to implement some form of undelete facility that's lighter
in weight.
Hmm, I think I'd rather see this built into programs, such as 'rm',
rather than into the filesystem itself.

For example, if I'm using ZFS for my OpenSolaris development, I might want
to enable this delete-history, just in case I rm a .c file that I need.

But I don't want to keep a history of .o, .a or executable files created,
either.

I want "make clean" or "make clobber" to not cause things to be kept around.

Which brings me to the next point, which is to say that there is probably
a need for "never snapshot" and "always snapshot" masks for matching
files against.
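A purely illustrative sketch of what such masks might look like (both
property names are invented; no such controls exist today):

$ zfs set snapmask:never='*.o *.a' tank/ws    # never preserve build output
$ zfs set snapmask:always='*.c *.h' tank/ws   # always preserve sources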

Darren
Gregory Shaw
2006-06-11 13:52:37 UTC
Permalink
Pardon me if this scenario has been discussed already, but I haven't
seen anything as yet.

I'd like to request a 'zpool evacuate pool <device>' command.
'zpool evacuate' would migrate the data from a disk device to other
disks in the pool.

Here's the scenario:

Say I have a small server with 6x146g disks in a jbod
configuration. If I mirror the system disk with SVM (currently) and
allocate the rest as a non-raidz pool, I end up with 4x146g in a pool
of approximately 584 GB capacity.

If one of the disks is starting to fail, I would need to use 'zpool
replace pool old-disk new-disk'. However, since I have no more slots in
the machine to add a replacement disk, I'm stuck.

This is where a 'zpool evacuate pool <device>' would come in handy.
It would allow me to evacuate the failing device so that it could be
replaced and re-added with 'zpool add pool <device>'.
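To make the proposal concrete, the workflow would look something like this
('zpool evacuate' is hypothetical; 'zpool add' exists today, and the device
name is illustrative):

$ zpool evacuate tank c1t3d0
< data migrates off c1t3d0; physically replace the drive >
$ zpool add tank c1t3d0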

What does the group think?

Thanks!

-----
Gregory Shaw, IT Architect
Phone: (303) 673-8273 Fax: (303) 673-8273
ITCTO Group, Sun Microsystems Inc.
1 StorageTek Drive MS 4382 ***@sun.com (work)
Louisville, CO 80028-4382 ***@fmsoft.com (home)
"When Microsoft writes an application for Linux, I've Won." - Linus
Torvalds
Dick Davies
2006-06-11 15:40:21 UTC
Permalink
Post by Gregory Shaw
Pardon me if this scenario has been discussed already, but I haven't
seen anything as yet.
I'd like to request a 'zpool evacuate pool <device>' command.
'zpool evacuate' would migrate the data from a disk device to other
disks in the pool.
Say I have a small server with 6x146g disks in a jbod
configuration. If I mirror the system disk with SVM (currently) and
allocate the rest as a non-raidz pool, I end up with 4x146g in a pool
of approximately 584 GB capacity.
If one of the disks is starting to fail, I would need to use 'zpool
replace pool old-disk new-disk'. However, since I have no more slots in
the machine to add a replacement disk, I'm stuck.
This is where a 'zpool evacuate pool <device>' would come in handy.
It would allow me to evacuate the failing device so that it could be
replaced and re-added with 'zpool add pool <device>'.
That makes sense to me - seems a good parallel to what
pvmove(8) does on Linux LVM. Useful not just for imminent failure,
but whenever you need to free up a physical partition (you realise you
need to dual-boot a laptop, for example).

I suppose this is only useful in the 'concatenated disk' (~raid0)
case (as you could just pull the disk otherwise).
--
Rasputin :: Jack of All Trades - Master of Nuns
http://number9.hellooperator.net/
Eric Schrock
2006-06-11 16:21:34 UTC
Permalink
This only seems valuable in the case of an unreplicated pool. We
already have 'zpool offline' to take a device and prevent ZFS from
talking to it (because it's in the process of failing, perhaps). This
gives you what you want for mirrored and RAID-Z vdevs, since there's no
data to migrate anyway.

We are also planning on implementing 'zpool remove' (for more than just
hot spares), which would allow you to remove an entire toplevel vdev,
migrating the data off of it in the process. This would give you what
you want for the case of an unreplicated pool.
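For example (pool and device names are illustrative; 'zpool offline' exists
today, while 'zpool remove' of a toplevel vdev is, as noted, still planned):

$ zpool offline tank c1t3d0   # stop ZFS from talking to a failing device
$ zpool remove tank c1t3d0    # planned: migrate data off, then drop the vdev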

Does this satisfy the usage scenario you described?

- Eric
Post by Gregory Shaw
Pardon me if this scenario has been discussed already, but I haven't
seen anything as yet.
I'd like to request a 'zpool evacuate pool <device>' command.
'zpool evacuate' would migrate the data from a disk device to other
disks in the pool.
Say I have a small server with 6x146g disks in a jbod
configuration. If I mirror the system disk with SVM (currently) and
allocate the rest as a non-raidz pool, I end up with 4x146g in a pool
of approximately 584 GB capacity.
If one of the disks is starting to fail, I would need to use 'zpool
replace pool old-disk new-disk'. However, since I have no more slots in
the machine to add a replacement disk, I'm stuck.
This is where a 'zpool evacuate pool <device>' would come in handy.
It would allow me to evacuate the failing device so that it could be
replaced and re-added with 'zpool add pool <device>'.
What does the group think?
Thanks!
--
Eric Schrock, Solaris Kernel Development http://blogs.sun.com/eschrock
Gregory Shaw
2006-06-12 04:18:52 UTC
Permalink
Yes, if zpool remove works like you describe, it does the same
thing. Is there a time frame for that feature?

Thanks!
Post by Eric Schrock
This only seems valuable in the case of an unreplicated pool. We
already have 'zpool offline' to take a device and prevent ZFS from
talking to it (because it's in the process of failing, perhaps). This
gives you what you want for mirrored and RAID-Z vdevs, since there's no
data to migrate anyway.
We are also planning on implementing 'zpool remove' (for more than just
hot spares), which would allow you to remove an entire toplevel vdev,
migrating the data off of it in the process. This would give you what
you want for the case of an unreplicated pool.
Does this satisfy the usage scenario you described?
- Eric
Post by Gregory Shaw
Pardon me if this scenario has been discussed already, but I haven't
seen anything as yet.
I'd like to request a 'zpool evacuate pool <device>' command.
'zpool evacuate' would migrate the data from a disk device to other
disks in the pool.
Say I have a small server with 6x146g disks in a jbod
configuration. If I mirror the system disk with SVM (currently) and
allocate the rest as a non-raidz pool, I end up with 4x146g in a pool
of approximately 584 GB capacity.
If one of the disks is starting to fail, I would need to use 'zpool
replace pool old-disk new-disk'. However, since I have no more slots in
the machine to add a replacement disk, I'm stuck.
This is where a 'zpool evacuate pool <device>' would come in handy.
It would allow me to evacuate the failing device so that it could be
replaced and re-added with 'zpool add pool <device>'.
What does the group think?
Thanks!
-----
Gregory Shaw, IT Architect
Phone: (303) 673-8273 Fax: (303) 673-8273
ITCTO Group, Sun Microsystems Inc.
1 StorageTek Drive MS 4382 ***@sun.com (work)
Louisville, CO 80028-4382 ***@fmsoft.com (home)
"When Microsoft writes an application for Linux, I've Won." - Linus
Torvalds
Dick Davies
2006-06-27 21:05:11 UTC
Permalink
Just wondered if there'd been any progress in this area?

Correct me if I'm wrong, but as it stands, there's no way
to remove a device you accidentally 'zpool add'ed without
destroying the pool.
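To illustrate the trap (names illustrative):

$ zpool attach tank c0t1d0 c0t2d0   # mirror an existing device; reversible
                                    # with 'zpool detach'
$ zpool add tank c0t2d0             # new toplevel vdev; no way to back out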
Post by Gregory Shaw
Yes, if zpool remove works like you describe, it does the same
thing. Is there a time frame for that feature?
Thanks!
Post by Eric Schrock
This only seems valuable in the case of an unreplicated pool. We
already have 'zpool offline' to take a device and prevent ZFS from
talking to it (because it's in the process of failing, perhaps). This
gives you what you want for mirrored and RAID-Z vdevs, since there's no
data to migrate anyway.
We are also planning on implementing 'zpool remove' (for more than just
hot spares), which would allow you to remove an entire toplevel vdev,
migrating the data off of it in the process. This would give you what
you want for the case of an unreplicated pool.
Does this satisfy the usage scenario you described?
- Eric
Post by Gregory Shaw
Pardon me if this scenario has been discussed already, but I haven't
seen anything as yet.
I'd like to request a 'zpool evacuate pool <device>' command.
'zpool evacuate' would migrate the data from a disk device to other
disks in the pool.
Say I have a small server with 6x146g disks in a jbod
configuration. If I mirror the system disk with SVM (currently) and
allocate the rest as a non-raidz pool, I end up with 4x146g in a pool
of approximately 584 GB capacity.
If one of the disks is starting to fail, I would need to use 'zpool
replace pool old-disk new-disk'. However, since I have no more slots in
the machine to add a replacement disk, I'm stuck.
This is where a 'zpool evacuate pool <device>' would come in handy.
It would allow me to evacuate the failing device so that it could be
replaced and re-added with 'zpool add pool <device>'.
What does the group think?
--
Rasputin :: Jack of All Trades - Master of Nuns
http://number9.hellooperator.net/
Noel Dellofano
2006-06-28 03:59:18 UTC
Permalink
a zpool remove/shrink type function is on our list of features we want
to add. We have RFE 4852783 ("reduce pool capacity") open to track this.


Noel
Post by Dick Davies
Just wondered if there'd been any progress in this area?
Correct me if i'm wrong, but as it stands, there's no way
to remove a device you accidentally 'zpool add'ed without
destroying the pool.
Post by Gregory Shaw
Yes, if zpool remove works like you describe, it does the same
thing. Is there a time frame for that feature?
Thanks!
Post by Eric Schrock
This only seems valuable in the case of an unreplicated pool. We
already have 'zpool offline' to take a device and prevent ZFS from
talking to it (because it's in the process of failing, perhaps). This
gives you what you want for mirrored and RAID-Z vdevs, since there's no
data to migrate anyway.
We are also planning on implementing 'zpool remove' (for more than just
hot spares), which would allow you to remove an entire toplevel vdev,
migrating the data off of it in the process. This would give you what
you want for the case of an unreplicated pool.
Does this satisfy the usage scenario you described?
- Eric
Post by Gregory Shaw
Pardon me if this scenario has been discussed already, but I haven't
seen anything as yet.
I'd like to request a 'zpool evacuate pool <device>' command.
'zpool evacuate' would migrate the data from a disk device to other
disks in the pool.
Say I have a small server with 6x146g disks in a jbod
configuration. If I mirror the system disk with SVM (currently) and
allocate the rest as a non-raidz pool, I end up with 4x146g in a pool
of approximately 584 GB capacity.
If one of the disks is starting to fail, I would need to use 'zpool
replace pool old-disk new-disk'. However, since I have no more slots in
the machine to add a replacement disk, I'm stuck.
This is where a 'zpool evacuate pool <device>' would come in handy.
It would allow me to evacuate the failing device so that it could be
replaced and re-added with 'zpool add pool <device>'.
What does the group think?
Robert Milkowski
2006-06-28 14:43:44 UTC
Permalink
Hello Noel,

Wednesday, June 28, 2006, 5:59:18 AM, you wrote:

ND> a zpool remove/shrink type function is on our list of features we want
ND> to add.
ND> We have RFE
ND> 4852783 reduce pool capacity
ND> open to track this.

Is there someone actually working on this right now?
--
Best regards,
Robert mailto:***@task.gda.pl
http://milek.blogspot.com
Noel Dellofano
2006-06-28 21:28:03 UTC
Permalink
Hey Robert,

Well, not yet. Right now our top two priorities are improving
performance in multiple areas of ZFS (soon there will be a performance
page tracking progress on the ZFS community page) and getting ZFS
boot done. Hence, we're not currently working on heaps of brand-new
features. So this is definitely on our list, but not yet being
worked on.

Noel
Post by Robert Milkowski
Hello Noel,
ND> a zpool remove/shrink type function is on our list of features we want
ND> to add.
ND> We have RFE
ND> 4852783 reduce pool capacity
ND> open to track this.
Is there someone actually working on this right now?
C***@Sun.COM
2006-06-11 14:05:18 UTC
Permalink
Post by Darren Reed
Hmm, I think I'd rather see this built into programs, such as 'rm', rather
than into the filesystem itself.
For example, if I'm using ZFS for my OpenSolaris development, I might want
to enable this delete-history, just in case I rm a .c file that I need.
But I don't want to keep a history of .o, .a or executable files created,
either.
And rm would know this how?

The assumption you make seems to be that .a and .o files are never valuable,
whereas they may be; I believe *BSD used some form of "don't archive this"
bit to achieve this goal; the compiler/linker would set this bit on the
files they created but it would not be automatically copied.
Post by Darren Reed
Which brings me to the next point which is to say that there is probably
a need for a "never shanpshot" and "always snapshot" masks for matching
files against.
I don't see how you can determine this on the basis of the file's name or
contents. You can determine this on the basis of how you got this file;
was it produced by the compiler, assembler, ld, yacc, lex, rpcgen, javac?

Since the number of such programs seems rather small, and the default is that
you want to keep a file, perhaps that is the way forward. Or you could
say that you know that certain sets of processes generate repeatable results,
such as "make" and its children, and make would set something inheritable
in the process word which would mark all files created during those processes
as disposable.
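As a sketch only, assuming a hypothetical wrapper command that sets such an
inheritable flag for a whole process tree (no such command exists):

$ dispose-scope make all    # hypothetical: every file created by make and
                            # its children is marked disposable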

Casper
Darren Reed
2006-06-12 03:24:53 UTC
Permalink
Post by C***@Sun.COM
Post by Darren Reed
Hmm, I think I'd rather see this built into programs, such as 'rm', rather
than into the filesystem itself.
For example, if I'm using ZFS for my OpenSolaris development, I might want
to enable this delete-history, just in case I rm a .c file that I need.
But I don't want to keep a history of .o, .a or executable files created,
either.
And rm would know this how?
The assumption you make seems to be that .a and .o files are never valuable,
whereas they may be; I believe *BSD used some form of "don't archive this"
bit to achieve this goal; the compiler/linker would set this bit on the files
they created but it would not be automatically copied.
Post by Darren Reed
Which brings me to the next point which is to say that there is probably
a need for a "never shanpshot" and "always snapshot" masks for matching
files against.
I don't see how you can determine this on the basis of the file's name or
contents. You can determine this on the basis of how you got this file;
was it produced by the compiler, assembler, ld, yacc, lex, rpcgen, javac?
Since the number of such programs seems rather small, and the default is that
you want to keep a file, perhaps that is the way forward. Or you could
say that you know that certain sets of processes generate repeatable results,
such as "make" and its children and make would set something inheritable
in the process word which would mark all files created during that processes
as disposible.
I think the idea you've suggested here, setting an extra
bit or property on the file as part of the workflow,
is a better idea than the one I had in mind.

Is passing a new flag through open(2) a way to achieve this,
such as O_DISPOSABLE (that is ignored by filesystems that
don't have any way to handle it), or should tools such as
make check to see if they're creating a file on ZFS and set
the extra bit appropriately with another system call?

Darren
Darren J Moffat
2006-06-12 08:38:03 UTC
Permalink
Post by Darren Reed
I think the idea you've suggested here, setting an extra
bit or property on the file as part of the workflow,
is a better idea than the one I had in mind.
Which is, I believe, covered by one or more of the following CRs:

5105713 want new security attributes on files
6417435 DOS attributes and additional timestamps to support CIFS
4058737 RFE: new attributes for ufs
Obviously this needs a ZFS equivalent.
Post by Darren Reed
Is passing a new flag through open(2) a way to achieve this,
such as O_DISPOSABLE (that is ignored by filesystems that
don't have any way to handle it), or should tools such as
make check to see if they're creating a file on ZFS and set
the extra bit appropriately with another system call?
Or use acl(2) interface or openat(2) interface depending on how these
are implemented.
--
Darren J Moffat
David Magda
2006-06-11 19:45:00 UTC
Permalink
Post by can you guess?
My dim recollection is that TOPS-10 implemented its popular (but
again <100%) undelete mechanism using the same kind of 'space-
available' approach suggested here. It did, however, support
explicit 'delete - I really mean it' facilities to help keep
unwanted detritus from shouldering out more desirable bits
('expunge' being the applicable incantation, which had an
appropriate ring of finality to it). Tying into user quotas such
that one user can't drive another user's most-recently-deleted
content out of the system seems implicit in eschrock's comments.
Venti is a network storage system that permanently stores data
blocks. A 160-bit SHA-1 hash of the data (called a score by Venti)
acts as the address of the data. This enforces a write-once policy
since no other data block can be found with the same address. The
addresses of multiple writes of the same data are identical, so
duplicate data is easily identified and the data block is stored
only once. Data blocks cannot be removed, making it ideal for
permanent or backup storage. Venti is typically used with Fossil to
provide a file system with permanent snapshots.
Fossil is the default file system in Plan 9 from Bell Labs. It
serves the network protocol 9P and runs as a user-space daemon,
like most Plan 9 file servers. Fossil differs from most other
file systems due to its snapshot/archival feature. It can take
snapshots of the entire file system on command or at a set interval.
These snapshots can be kept on the Fossil partition as long as disk
space allows; if the partition fills up, old snapshots will be
removed to free up disk space. A snapshot can also be saved
permanently to Venti. Fossil and Venti are typically installed
together.
[1] http://en.wikipedia.org/wiki/Venti
[2] http://en.wikipedia.org/wiki/Fossil_%28file_system%29
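As a quick illustration of the content-addressing idea: identical data always
hashes to the identical score, so a duplicate write can be detected and the
block stored only once (GNU sha1sum is used here for brevity; on Solaris,
'digest -a sha1' gives the same result):

$ printf 'hello' | sha1sum
aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d  -
$ printf 'hello' | sha1sum
aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d  -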