David - Musings of an SRE

Diagnosing too many open files

A “Too Many Open Files” error usually means that a process has used up all the file descriptors it is allowed to open.

To diagnose this, first find the PID of the process that reported the error:

$ ps auwx | grep <name_of_process>
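As a possible alternative, pgrep avoids matching the grep command itself. This is a minimal sketch, assuming pgrep from procps-ng is installed; find_pids is a hypothetical helper name, not a standard tool.

```shell
# find_pids is a hypothetical helper, not part of any standard tool.
# Prints "PID full-command-line" for every process whose command line
# matches the given pattern.
#   -f  match against the full command line, not just the process name
#   -a  print the full command line next to each PID
find_pids() {
    pgrep -a -f -- "$1"
}

# Example (the process name is only an illustration):
#   find_pids nginx
```

The first column of the output is the PID to use in the steps that follow.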

Verify open file limit per process

πŸ“ Note: Each process that runs have a number of open files limit. By default, most of the time the limit is 1024

$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 15428
max locked memory       (kbytes, -l) 65536
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 15428
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

The key value to look for is open files (-n).

📝 A shortcut is ulimit -n, which prints only the open files limit.

Verify the limit of the process that is running

A running process can have a different limit from the one returned by ulimit -n, which only reports the limit of the current shell. Limits are inherited from the parent process, and a process can change its own with the setrlimit(2) syscall or have them changed from outside with prlimit(2).

To check a running process's actual limit:
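To see this effect in practice, a running process's limit can be changed from outside with the prlimit utility from util-linux, which wraps the prlimit(2) syscall. This is a minimal sketch, assuming prlimit is installed and the target process belongs to you (or you are root); set_soft_nofile is a hypothetical helper name.

```shell
# set_soft_nofile is a hypothetical helper, not a standard tool.
# Sets only the soft open-file limit of a running process.
# The trailing colon in "<soft>:" leaves the hard limit untouched.
#   $1  PID of the target process
#   $2  new soft limit (must not exceed the process's hard limit)
set_soft_nofile() {
    prlimit --pid "$1" --nofile="$2":
}

# Example:
#   set_soft_nofile <pid> 512
```

After such a change, the process's limit no longer matches what ulimit -n prints in your shell.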

$ cat /proc/<pid>/limits
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        0                    unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             15428                15428                processes
Max open files            1024                 524288               files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       15428                15428                signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us

Look for the values in the Max open files row.
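When only that row is of interest, it can be pulled out with awk. This is a minimal sketch for Linux; nofile_limits is a hypothetical helper name.

```shell
# nofile_limits is a hypothetical helper, not a standard tool.
# Prints "<soft> <hard>" open-file limits for the given PID by parsing
# the "Max open files" row of /proc/<pid>/limits.
# In that row, fields 1-3 are the label and fields 4 and 5 are the
# soft and hard limits.
nofile_limits() {
    awk '/^Max open files/ { print $4, $5 }' "/proc/$1/limits"
}

# Example, using the current shell's own PID:
#   nofile_limits $$
```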

πŸ“ What is Soft Limit and Hard Limit?

Soft limit is the value that a non-root user can change.

Hard limit is the maximum it can go to. Only root users can change this value.
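Both values can be inspected from the shell with ulimit -S and ulimit -H. This is a minimal sketch; with_max_nofile is a hypothetical helper name that raises the soft limit to the hard limit for a single command only.

```shell
# Show both open-file limits of the current shell.
ulimit -Sn   # soft limit
ulimit -Hn   # hard limit

# with_max_nofile is a hypothetical helper, not a standard tool.
# Runs a command with the soft open-file limit raised to the hard limit.
# The parentheses start a subshell, so the limit of the calling shell
# is left unchanged.
with_max_nofile() {
    (
        ulimit -Sn "$(ulimit -Hn)" || exit 1
        "$@"
    )
}

# Example (the program name is only an illustration):
#   with_max_nofile ./my_server
```

This needs no root privileges, because the soft limit is only raised as far as the existing hard limit.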

Verify the number of file descriptors opened

There are two ways to do this.

$ ls /proc/<pid>/fd | wc -l
1024
$ lsof -p <pid> | wc -l
1028

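I recommend lsof -p <pid>, since it shows exactly which files are open, so you can trace where the descriptors are going. Its line count is higher than the number of descriptors because the output also includes a header line and entries that are not file descriptors, such as the current directory (cwd) and the program text (txt).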
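The checks above can be combined to show how close a process is to its limit and what its descriptors point at. This is a minimal sketch for Linux, assuming you may read /proc/<pid>/fd (same user or root); fd_usage and fd_targets are hypothetical helper names.

```shell
# fd_usage is a hypothetical helper, not a standard tool.
# Prints "<open descriptors> / <soft limit>" for the given PID.
fd_usage() {
    printf '%s / %s\n' \
        "$(ls "/proc/$1/fd" | wc -l)" \
        "$(awk '/^Max open files/ { print $4 }' "/proc/$1/limits")"
}

# fd_targets is a hypothetical helper, not a standard tool.
# Counts open descriptors per target (file path, socket, pipe), most
# frequent first. Each entry in /proc/<pid>/fd is a symlink to the
# thing the descriptor refers to, so readlink reveals the target.
fd_targets() {
    for fd in "/proc/$1/fd"/*; do
        readlink "$fd"
    done | sort | uniq -c | sort -rn
}

# Example:
#   fd_usage <pid>
#   fd_targets <pid> | head
```

A single target with a very large count at the top of the fd_targets output is often a good first suspect for a descriptor leak.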
