Discussion:
cannot delete file when fs 100% full
Paul Raines
2008-05-30 14:43:48 UTC
It seems that when a ZFS filesystem with reserv/quota set is 100% full, users can no
longer even delete files to fix the situation; they get errors like these:

$ rm rh.pm6895.medial.V2.tif
rm: cannot remove `rh.pm6895.medial.V2.tif': Disk quota exceeded

(this is over NFS from a RHEL4 Linux box)

I can log in as root on the Sun server and delete the file as root.
After doing that, the user can then delete files okay.

Is there any way to work around this that does not involve root intervention?
Users are filling up their volumes all the time, which is the
reason they must have reserv/quota set.
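
For what it's worth, a quick way to confirm the state of things when this happens (the dataset name and path here are only placeholders) is something like:

$ zfs list -o name,used,available,referenced,quota,reservation zpool1/somevol
$ df -h /zpool1/somevol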
--
---------------------------------------------------------------
Paul Raines email: raines at nmr.mgh.harvard.edu
MGH/MIT/HMS Athinoula A. Martinos Center for Biomedical Imaging
149 (2301) 13th Street Charlestown, MA 02129 USA
Joe Little
2008-05-30 15:19:11 UTC
Post by Paul Raines
It seems when a zfs filesystem with reserv/quota is 100% full users can no
$ rm rh.pm6895.medial.V2.tif
rm: cannot remove `rh.pm6895.medial.V2.tif': Disk quota exceeded
(this is over NFS from a RHEL4 Linux box)
I can log in as root on the Sun server and delete the file as root.
After doing that, the user can then delete files okay.
Is there any way to work around this that does not involve root intervention?
Users are filling up their volumes all the time which is the
reason they must have reserv/quota set.
Well, with a copy-on-write filesystem a delete actually requires a
write. That said, there have been certain religious arguments on the
list about whether the "quota" support presented by ZFS is sufficient.
In a nutshell, per-user quotas are not implemented, and the suggested
workaround is a per-user filesystem with quota/reservations. It's
inelegant at best, since the auto-mount definitions become their own
pain to maintain.
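
For reference, that per-user-filesystem workaround would look roughly like the sketch below; the pool, user names and quota sizes are made-up examples:

# one filesystem per user, each with its own quota/reservation
zfs create -o quota=10G -o reservation=10G -o sharenfs=on tank/home/alice
zfs create -o quota=10G -o reservation=10G -o sharenfs=on tank/home/bob
# the matching automount entries (e.g. in the auto_home map) then have
# to be maintained per user, which is the administrative pain mentioned above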

The other missing feature is support for soft and hard quota limits.
Most people have gotten around this by presenting only UFS
volumes held inside ZFS zvols to end users, but that defeats the
purpose of providing snapshots directly to end users, etc.
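
A rough sketch of that UFS-on-zvol arrangement (volume name, size and mount point are made up):

# create a zvol, put UFS on it, and export that to users instead of a ZFS fs
zfs create -V 100G tank/vol_alice
newfs /dev/zvol/rdsk/tank/vol_alice
mkdir -p /export/alice
mount /dev/zvol/dsk/tank/vol_alice /export/alice
# UFS soft/hard user quotas can then be managed with edquota/quotaon,
# at the cost of losing ZFS snapshots at the per-user level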

However, since snapshots are only available at the filesystem level,
you are still restricted to one filesystem per user to use snapshots
well. I would argue that hard/soft quota limits are the
unanswered problem that doesn't have a known workaround.
Post by Paul Raines
--
---------------------------------------------------------------
Paul Raines email: raines at nmr.mgh.harvard.edu
MGH/MIT/HMS Athinoula A. Martinos Center for Biomedical Imaging
149 (2301) 13th Street Charlestown, MA 02129 USA
_______________________________________________
zfs-discuss mailing list
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Paul Raines
2008-08-14 20:25:45 UTC
This problem is becoming a real pain to us again and I was wondering
if there has been in the past few months any known fix or workaround.

I normally create zfs fs's like this:

zfs create -o quota=131G -o reserv=131G -o recsize=8K zpool1/newvol

and then just nfs export through /etc/dfs/dfstab. We deal with lots
of small image files in our MRI data which is the reason for small
recsize.
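
For completeness, the corresponding /etc/dfs/dfstab entry would be a share line along these lines (the client list is a placeholder); alternatively the dataset's sharenfs property can be used:

share -F nfs -o rw=client1:client2 /zpool1/newvol
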
Post by Joe Little
Post by Paul Raines
It seems when a zfs filesystem with reserv/quota is 100% full users can no
$ rm rh.pm6895.medial.V2.tif
rm: cannot remove `rh.pm6895.medial.V2.tif': Disk quota exceeded
(this is over NFS from a RHEL4 Linux box)
I can log in as root on the Sun server and delete the file as root.
After doing that, the user can then delete files okay.
Is there any way to work around this that does not involve root intervention?
Users are filling up their volumes all the time which is the
reason they must have reserv/quota set.
Well, with a copy-on-write filesystem a delete actually requires a
write. That said, there have been certain religious arguments on the
list about whether the "quota" support presented by ZFS is sufficient.
In a nutshell, per-user quotas are not implemented, and the suggested
workaround is a per-user filesystem with quota/reservations. It's
inelegant at best, since the auto-mount definitions become their own
pain to maintain.
The other missing feature is support for soft and hard quota limits.
Most people have gotten around this by presenting only UFS
volumes held inside ZFS zvols to end users, but that defeats the
purpose of providing snapshots directly to end users, etc.
However, since snapshots are only available at the filesystem level,
you are still restricted to one filesystem per user to use snapshots
well. I would argue that hard/soft quota limits are the
unanswered problem that doesn't have a known workaround.
Post by Paul Raines
--
---------------------------------------------------------------
Paul Raines email: raines at nmr.mgh.harvard.edu
MGH/MIT/HMS Athinoula A. Martinos Center for Biomedical Imaging
149 (2301) 13th Street Charlestown, MA 02129 USA
_______________________________________________
zfs-discuss mailing list
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
--
---------------------------------------------------------------
Paul Raines email: raines at nmr.mgh.harvard.edu
MGH/MIT/HMS Athinoula A. Martinos Center for Biomedical Imaging
149 (2301) 13th Street Charlestown, MA 02129 USA
Tomas Ögren
2008-08-14 22:35:34 UTC
Post by Paul Raines
This problem is becoming a real pain to us again and I was wondering
if there has been in the past few months any known fix or workaround.
Sun is sending me an IDR this/next week regarding this bug..

/Tomas
--
Tomas Ögren, ***@acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
Michael Schuster
2008-08-29 13:59:07 UTC
Post by Tomas Ögren
Post by Paul Raines
This problem is becoming a real pain to us again and I was wondering
if there has been in the past few months any known fix or workaround.
Sun is sending me an IDR this/next week regarding this bug..
It seems to work, but I am unfortunately not allowed to pass this IDR on.
IDR are "point patches", built against specific kernel builds (IIRC) and as
such not intended for a wider distribution. Therefore they need to be
tracked so they can be replaced with the proper patch once that is available.
If you believe you need the IDR, you need to get in touch with your local
services organisation and ask them to get it to you - they know the proper
procedures to make sure you get one that works on your machine(s) and that
you also get the patch once it's available.

HTH
Michael
--
Michael Schuster http://blogs.sun.com/recursion
Recursion, n.: see 'Recursion'
Sanjeev
2008-08-29 15:56:53 UTC
Thanks, Michael, for the clarification about the IDR! :-)
I was planning to give this explanation myself.

The fix I have in there is a temporary one.
I am currently looking at a better way of accounting for the
fatzap blocks to make sure we cover all the cases.
I have got some pointers from Mark Maybee and am looking into it
right now.

Thanks and regards,
Sanjeev.
Post by Michael Schuster
Post by Tomas Ögren
Post by Paul Raines
This problem is becoming a real pain to us again and I was wondering
if there has been in the past few months any known fix or workaround.
Sun is sending me an IDR this/next week regarding this bug..
It seems to work, but I am unfortunately not allowed to pass this IDR on.
IDR are "point patches", built against specific kernel builds (IIRC) and as
such not intended for a wider distribution. Therefore they need to be
tracked so they can be replaced with the proper patch once that is available.
If you believe you need the IDR, you need to get in touch with your local
services organisation and ask them to get it to you - they know the proper
procedures to make sure you get one that works on your machine(s) and that
you also get the patch once it's available.
HTH
Michael
--
Michael Schuster http://blogs.sun.com/recursion
Recursion, n.: see 'Recursion'
_______________________________________________
zfs-discuss mailing list
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Tomas Ögren
2008-08-29 11:09:09 UTC
Post by Tomas Ögren
Post by Paul Raines
This problem is becoming a real pain to us again and I was wondering
if there has been in the past few months any known fix or workaround.
Sun is sending me an IDR this/next week regarding this bug..
It seems to work, but I am unfortunately not allowed to pass this IDR
on. A temporary (redistributable) patch will surface soon, and a real patch
in 3-6 weeks (Sun Eng estimate).

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6664765

/Tomas
--
Tomas Ögren, ***@acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
Shawn Ferry
2008-08-29 17:08:20 UTC
Post by Paul Raines
This problem is becoming a real pain to us again and I was wondering
if there has been in the past few months any known fix or workaround.
I had this problem in the past. Fortunately I was able to recover by
removing an old snapshot which gave me enough room to deal with my
problems.

Now, I create a fs called reserved and set a small reservation to
ensure that there is a small amount of space available.

[sferry<@>noroute(0) 12:59 s001]
</Users/sferry>
[6] zfs get reservation,mountpoint,canmount,type noroute/reserved
NAME              PROPERTY     VALUE       SOURCE
noroute/reserved  reservation  50M         local
noroute/reserved  mountpoint   none        inherited from noroute
noroute/reserved  canmount     off         local
noroute/reserved  type         filesystem  -

If I fill the pool now, I reduce the reservation (reduce instead of
remove in case I have something writing uncontrollably to the pool)
and clean up.
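
The setup behind that output is roughly the following (names and sizes taken from the example above):

# small reservation on a filesystem that is never mounted, just to keep slack
zfs create -o reservation=50M -o mountpoint=none -o canmount=off noroute/reserved
# when the pool fills up: shrink (rather than remove) the reservation,
# clean up, then restore it
zfs set reservation=25M noroute/reserved
zfs set reservation=50M noroute/reserved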

Shawn

--
Shawn Ferry shawn.ferry at sun.com
Senior Primary Systems Engineer
Sun Managed Operations
Paul Raines
2008-08-29 21:59:43 UTC
Post by Paul Raines
This problem is becoming a real pain to us again and I was wondering
if there has been in the past few months any known fix or workaround.
I had this problem in the past. Fortunately I was able to recover by removing
an old snapshot which gave me enough room to deal with my problems.
Now, I create a fs called reserved and set a small reservation to ensure that
there is a small amount of space available.
</Users/sferry>
[6] zfs get reservation,mountpoint,canmount,type noroute/reserved
NAME              PROPERTY     VALUE       SOURCE
noroute/reserved  reservation  50M         local
noroute/reserved  mountpoint   none        inherited from noroute
noroute/reserved  canmount     off         local
noroute/reserved  type         filesystem  -
If I fill the pool now, I reduce the reservation (reduce instead of remove in
case I have something writing uncontrollably to the pool) and clean up.
When this problem happens to us, I have no problem deleting a file as root
to get things back on track. It is just that normal users (who are accessing
only over NFS) cannot delete anything. As soon as I delete a file as root,
normal users can then start deleting things themselves.
Robert Milkowski
2008-08-18 08:42:54 UTC
Hello Paul,

Thursday, August 14, 2008, 9:25:45 PM, you wrote:

PR> This problem is becoming a real pain to us again and I was wondering
PR> if there has been in the past few months any known fix or workaround.

PR> I normally create zfs fs's like this:

PR> zfs create -o quota=131G -o reserv=131G -o recsize=8K zpool1/newvol

PR> and then just nfs export through /etc/dfs/dfstab. We deal with lots
PR> of small image files in our MRI data which is the reason for small
PR> recsize.

If you are creating lots of small files and you are not worried about
accessing only small parts of them with a smaller block, then it
doesn't really make sense to reduce the record size. With the standard 128K
recsize, if you create a small file, let's say 8K in size, the block
size used by ZFS will actually be 8K.

Unless you are doing this to work around the space maps issue.
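
A quick way to check that claim (the dataset name is made up; compression or raidz parity can skew the exact numbers):

zfs get recordsize tank/test          # 128K by default
mkfile 8k /tank/test/smallfile
sync
du -k /tank/test/smallfile            # expect roughly 8-9K, not 128K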
--
Best regards,
Robert Milkowski mailto:***@task.gda.pl
http://milek.blogspot.com
Paul Raines
2008-08-18 13:54:05 UTC
I read that that should be the case but did not see it in practice. I
created one volume without the recsize setting and one with it, then
copied the same data to both (lots of small files). The 'du' report
on the one without the recsize setting was significantly bigger than on
the one where I set it, and that volume in fact filled up before the other.
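
Roughly what that comparison looks like (dataset and data paths are made up):

zfs create zpool1/test_default               # recordsize left at the 128K default
zfs create -o recsize=8K zpool1/test_8k
cp -rp /data/mri/subject1 /zpool1/test_default/
cp -rp /data/mri/subject1 /zpool1/test_8k/
du -sk /zpool1/test_default /zpool1/test_8k  # compare allocated space
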
Post by Robert Milkowski
Hello Paul,
PR> This problem is becoming a real pain to us again and I was wondering
PR> if there has been in the past few months any known fix or workaround.
PR> zfs create -o quota=131G -o reserv=131G -o recsize=8K zpool1/newvol
PR> and then just nfs export through /etc/dfs/dfstab. We deal with lots
PR> of small image files in our MRI data which is the reason for small
PR> recsize.
If you are creating lots of small files and you are not worried about
accessing only small parts of them with a smaller block, then it
doesn't really make sense to reduce the record size. With the standard 128K
recsize, if you create a small file, let's say 8K in size, the block
size used by ZFS will actually be 8K.
Unless you are doing this to work around the space maps issue.
--
---------------------------------------------------------------
Paul Raines email: raines at nmr.mgh.harvard.edu
MGH/MIT/HMS Athinoula A. Martinos Center for Biomedical Imaging
149 (2301) 13th Street Charlestown, MA 02129 USA
Lance
2008-05-31 00:28:25 UTC
Post by Paul Raines
It seems when a zfs filesystem with reserv/quota is 100% full users can no
longer even delete files to fix the situation getting
$ rm rh.pm6895.medial.V2.tif
rm: cannot remove `rh.pm6895.medial.V2.tif': Disk quota exceeded
We've run into the same problem here. It's a known problem first mentioned on the ZFS forum in July 2006 and remains unfixed even in Solaris 10 Update 5. The only workaround we found is to truncate the file using something like "cat /dev/null >! file" (for tcsh) since that doesn't trigger copy-on-write. Unfortunately, it may not be easy to train all users to do that.

Note: we have one large zpool on an X4500 Thumper and discovered that even truncation won't work if the top-level file system 1) has a manually declared quota and 2) is full. So we had to leave the quota turned off at the top level and slightly undercommit the total disk space in user file systems so truncation shell syntax continued to work for NFS users.
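
For the record, the truncate-then-remove sequence from the NFS client side would look like this (tcsh, where ">!" overrides noclobber; in sh/bash the equivalent truncation is ": > file"):

% cat /dev/null >! rh.pm6895.medial.V2.tif
% rm rh.pm6895.medial.V2.tif

Once the truncation has freed some space, the rm should go through as well.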


This message posted from opensolaris.org