author | Massimiliano Culpo <massimiliano.culpo@gmail.com> | 2020-06-05 09:08:32 +0200 |
---|---|---|
committer | GitHub <noreply@github.com> | 2020-06-05 00:08:32 -0700 |
commit | 5b272e3ff3f7cac83d4e3db402781f535950d26f (patch) | |
tree | dc15a4133c625ec2157fcb8ca1b25d4c3f551a57 /.gitattributes | |
parent | 92e24950e5cbe28a3b241b13e2dc5363c7b994d7 (diff) | |
commands: use a single ThreadPool for `spack versions` (#16749)
This fixes a fork bomb in `spack versions`. Recursive generation of pools
to scrape URLs in `_spider` was creating large numbers of processes.
Instead of recursively creating process pools, we now use a single
`ThreadPool` with a concurrency limit.
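
The idea can be illustrated with a minimal sketch. It is not the code from this commit: `FAKE_LINKS`, `fetch_links`, and `crawl` are hypothetical names, and an in-memory dictionary stands in for real HTTP requests. It only shows the general pattern of walking links level by level while reusing one bounded `multiprocessing.pool.ThreadPool`, instead of creating a new pool at each recursion step.

```python
from multiprocessing.pool import ThreadPool

# Hypothetical in-memory link graph standing in for real web pages.
FAKE_LINKS = {
    "root": ["a", "b"],
    "a": ["c", "root"],
    "b": ["c", "d"],
    "c": [],
    "d": ["e"],
    "e": [],
}


def fetch_links(url):
    """Return the links found on one page (stand-in for an HTTP request)."""
    return FAKE_LINKS.get(url, [])


def crawl(root, depth, concurrency=4):
    """Fetch pages up to `depth` links away from `root` using one thread pool."""
    seen = {root}        # pages already scheduled, to avoid fetching twice
    fetched = set()      # pages actually fetched
    frontier = [root]    # pages to fetch at the current level
    pool = ThreadPool(processes=concurrency)  # the only pool ever created
    try:
        for _ in range(depth + 1):
            if not frontier:
                break
            # Every page of this level goes through the same bounded pool,
            # so at most `concurrency` fetches run at any time.
            results = pool.map(fetch_links, frontier)
            fetched.update(frontier)
            next_frontier = []
            for links in results:
                for link in links:
                    if link not in seen:
                        seen.add(link)
                        next_frontier.append(link)
            frontier = next_frontier
    finally:
        pool.terminate()
        pool.join()
    return fetched
```

In this sketch the number of worker threads stays fixed at `concurrency` regardless of how many links are discovered, which is the property the commit message describes.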
More on the issue: having ~10 users run `spack versions` at the same
time on front-end nodes caused a kernel lockup due to the high number
of open sockets (the sysadmin reported ~210k distributed over 3 nodes).
The users were internal, so they had `ulimit -n` set to ~70k.
The forking behavior could be observed by just running:
$ spack versions boost
and checking the number of processes spawned. The number of processes
per se was not the issue, but each one of them opens a socket,
which can stress `iptables`.
In the original issue the kernel watchdog was reporting:
Message from syslogd@login03 at May 19 12:01:30 ...
kernel:Watchdog CPU:110 Hard LOCKUP
Message from syslogd@login03 at May 19 12:01:31 ...
kernel:watchdog: BUG: soft lockup - CPU#110 stuck for 23s! [python3:2756]
Message from syslogd@login03 at May 19 12:01:31 ...
kernel:watchdog: BUG: soft lockup - CPU#94 stuck for 22s! [iptables:5603]