Scope of the topic
This topic helps you troubleshoot a quota issue caused by using too much space.
It does not cover the inode quota (the total number of files); that will be the subject of another topic.
First step: find out how much is used exactly
The df-ulhpc command displays the quota of all your main folders (home, scratch, projects…). The command can only be run on the access nodes, i.e. not on a worker node.
After running the df-ulhpc command, I see this:
Directory Used Soft quota Hard quota Grace period
--------- ---- ---------- ---------- ------------
/home/users/jschleich 281.3G 500G 550G none
/mnt/lscratch/ 2.967G 10T 11T none
/work/projects/dummy 1.074T 1000G 1.074T 29hours
Here we can see that my dummy project is using too much space, which explains why I can no longer write new files.
For more information about quotas, check our related ULHPC documentation.
Soft quota, hard quota and grace period
On the ULHPC there is some tolerance when you exceed your soft quota. You can keep writing new files as long as you do not reach the hard quota, and only for at most 7 days (the grace period).
Second step: find out which files are taking a lot of space
Now that I know that my project uses too much space, let's see whether this is due to some big files.
Top 100 biggest files
The following command lists all the files (-type f) in the provided directory, evaluates the size of each in a human-readable form (du -h), sorts them in reverse order (sort -h -r), and displays the top 100 biggest files (head -n 100).
# Change /work/projects/dummy for the folder you want to consider
# Change 100 to whatever is relevant to you
# Here -type f means files only (not folders)
find /work/projects/dummy/ -type f -exec du -h {} + | sort -h -r | head -n 100
Top 100 biggest files but excluding a given pattern
You tried the previous command, but most of the big files in the list cannot be deleted because you still need them for some reason. If those files have either their extension or their filename pattern in common, you can filter them out. The following command filters out files with the .bam extension:
# Change /work/projects/dummy for the folder you want to consider
# Change 100 to whatever is relevant to you
# Here ! -name "pattern" means NOT with a name that matches the pattern
find /work/projects/dummy -type f ! -name "*.bam" -exec du -h {} + | sort -h -r | head -n 100
Alternatively, if you already know a filename pattern / extension that can be deleted, you can list only the matching files with the following command:
# Change /work/projects/dummy for the folder you want to consider
# Change 100 to whatever is relevant to you
# Here -name "pattern" means only files with a name that matches the pattern
find /work/projects/dummy -type f -name "*.bam" -exec du -h {} + | sort -h -r | head -n 100
Top 100 biggest folders
You checked the top 100 biggest files but none of them is very big. A possible explanation is that you have a lot of small files whose combined size adds up above the quota.
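To quickly test the many-small-files hypothesis, you can first count the files in the folder. A minimal sketch (adjust the path to your own project):

```shell
# Count all files (not folders) under the project directory
# Change /work/projects/dummy for the folder you want to consider
find /work/projects/dummy -type f | wc -l
```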
Let's find out which folders are the biggest, i.e. the cumulated size of their files and sub-folders. You can do that with the following command:
# Change /work/projects/dummy for the folder you want to consider
# Change 100 to whatever is relevant to you
# Here -type d means folders only (not files)
find /work/projects/dummy -type d -exec du -h {} + | sort -h -r | head -n 100
Note that the result will also contain the parents of large sub-folders. For example, if folder /work/projects/dummy/folder/subfolder is in the top 100, then its parent folder (/work/projects/dummy/folder) will also be included in the list, as it is at least as big.
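If you only care about the immediate sub-folders, a du-based sketch avoids those nested duplicates; --max-depth=1 is a GNU coreutils option, assumed to be available on the cluster:

```shell
# Report only the top-level folder and its direct sub-folders
# Change /work/projects/dummy for the folder you want to consider
du -h --max-depth=1 /work/projects/dummy | sort -h -r | head -n 100
```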
Third step: cleaning time
This section provides a collection of snippets to delete files based on some condition. Please make sure to double-check before deleting files.
Delete files based on a file pattern
# Here all files with the .bam extension will be deleted
find /work/projects/dummy -type f -name "*.bam" -exec rm -f {} \;
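One way to double-check is to preview the matches first by replacing -exec rm -f {} \; with -print, which only lists the files that would be removed:

```shell
# Dry run: list the files that the delete command above would remove
find /work/projects/dummy -type f -name "*.bam" -print
```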
Delete files based on a maximum file size
# Here all files larger than 1GB will be deleted
find /work/projects/dummy -type f -size +1G -exec rm -f {} \;
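To estimate how much space such a deletion would reclaim, you can first sum the matching files with du -c (the last line is the total; with very many files, find may run du in several batches, so treat the figure as approximate):

```shell
# Sum the size of all files above 1GB without deleting anything
find /work/projects/dummy -type f -size +1G -exec du -ch {} + | tail -n 1
```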