path: root/lib/spack/llnl/util/lock.py
Age Commit message Author Files Lines
2024-03-11Remove dead code (#43114)Massimiliano Culpo1-4/+0
* Remove dead code in spack * Remove dead code in llnl
2024-01-02Update copyright year to 2024 (#41919)Todd Gamblin1-1/+1
It was time to run `spack license update-copyright-year` again.
2023-09-28Partial removal of circular dependencies between `spack` and `llnl` (#40090)Massimiliano Culpo1-2/+2
Modifications: - [x] Move `spack.util.string` to `llnl.string` - [x] Remove dependency of `llnl` on `spack.error` - [x] Move path of `spack.util.path` to `llnl.path` - [x] Move `spack.util.environment.get_host_*` to `spack.spec`
2023-08-08Fix broken inode assertion (#39188)Harmen Stoppels1-1/+1
2023-07-20spack.util.lock: add type-hints, remove **kwargs in method signatures (#39011)Massimiliano Culpo1-0/+1
2023-07-19llnl.util.lock: add type-hints (#38977)Massimiliano Culpo1-64/+105
Also uppercase global variables in the module
2023-07-05Drop Python 2 super syntax (#38718)Adam J. Stewart1-5/+5
2023-07-05Drop Python 2 object subclassing (#38720)Adam J. Stewart1-5/+5
2023-01-18license year bump (#34921)Harmen Stoppels1-1/+1
* license bump year * fix black issues of modified files * mypy * fix 2021 -> 2023
2022-11-25Track locks by (dev, ino); close file handlers between tests (#34122)Harmen Stoppels1-18/+16
2022-10-24locks: improved errors (#33477)Harmen Stoppels1-4/+18
Instead of showing ``` ==> Error: Timed out waiting for a write lock. ``` show ``` ==> Error: Timed out waiting for a write lock after 1.200ms and 4 attempts on file: /some/file ``` s.t. we actually get to see where acquiring a lock failed even when not running in debug mode. And use pretty time units everywhere, so we don't get 1.45e-9 seconds but 1.450ns etc.
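The message format above can be sketched as follows. This is a minimal illustration, not the code from the commit: `pretty_seconds` and `timeout_message` are hypothetical names, and the choice of three decimal places is inferred from the `1.200ms` and `1.450ns` examples in the commit message.

```python
def pretty_seconds(seconds: float) -> str:
    """Format a duration with the largest unit that keeps the value >= 1."""
    units = [("s", 1.0), ("ms", 1e-3), ("us", 1e-6), ("ns", 1e-9)]
    for name, scale in units:
        if abs(seconds) >= scale:
            return f"{seconds / scale:.3f}{name}"
    # Zero or sub-nanosecond durations fall back to the smallest unit.
    return f"{seconds / 1e-9:.3f}ns"


def timeout_message(kind: str, elapsed: float, attempts: int, path: str) -> str:
    """Build an error message that says how long we waited and on which file."""
    return (
        f"Timed out waiting for a {kind} lock after "
        f"{pretty_seconds(elapsed)} and {attempts} attempts on file: {path}"
    )
```

Including the path and attempt count in the exception text itself means the failing lock can be identified without rerunning in debug mode.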
2022-09-06Fix spack locking on some NFS systems (#32426)Seth R. Johnson1-2/+6
Co-authored-by: Todd Gamblin <tgamblin@llnl.gov>
2022-07-31black: reformat entire repository with blackTodd Gamblin1-79/+85
2022-03-17Windows Support: Testing Suite integrationJohn Parent1-0/+7
Broaden support for execution of the test suite on Windows. General bug and review fixups
2022-03-17"spack commands --update-completion"John Parent1-17/+42
2022-01-14Update copyright year to 2022Todd Gamblin1-1/+1
2021-08-24locks: only open lockfiles once instead of for every lock held (#24794)Todd Gamblin1-20/+128
This adds lockfile tracking to Spack's lock mechanism, so that we ensure that there is only one open file descriptor per inode. The `fcntl` locks that Spack uses are associated with an inode and a process. This is convenient, because if a process exits, it releases its locks. Unfortunately, this also means that if you close a file, *all* locks associated with that file's inode are released, regardless of whether the process has any other open file descriptors on it. Because of this, we need to track open lock files so that we only close them when a process no longer needs them. We do this by tracking each lockfile by its inode and process id. This has several nice properties: 1. Tracking by pid ensures that, if we fork, we don't inadvertently track the parent process's lockfiles. `fcntl` locks are not inherited across forks, so we'll just track new lockfiles in the child. 2. Tracking by inode ensures that references are counted per inode, and that we don't inadvertently close a file whose inode still has open locks. 3. Tracking by both pid and inode ensures that we only open lockfiles the minimum number of times necessary for the locks we have. Note: as mentioned elsewhere, these locks aren't thread safe -- they're designed to work in Python and assume the GIL. Tasks: - [x] Introduce an `OpenFileTracker` class to track open file descriptors by inode. - [x] Reference-count open file descriptors and only close them if they're no longer needed (this avoids inadvertently releasing locks that should not be released).
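The reference-counting idea described above can be sketched roughly like this. It is an assumption-laden illustration rather than the actual `OpenFileTracker`: the class name `OpenFileTrackerSketch`, its method names, and the `"ab+"` open mode are all hypothetical, and no `fcntl` lock is taken here. The point is only that a handle is shared per `(pid, inode)` key and closed when the last user releases it.

```python
import os
import tempfile


class OpenFileTrackerSketch:
    """Keep one open handle per (pid, inode) and count its users."""

    def __init__(self):
        self._open = {}  # (pid, inode) -> [file handle, reference count]

    def acquire(self, path):
        pid = os.getpid()
        try:
            key = (pid, os.stat(path).st_ino)
        except FileNotFoundError:
            key = None  # no inode yet; the file is created below
        if key not in self._open:
            handle = open(path, "ab+")  # creates the file if it is missing
            key = (pid, os.fstat(handle.fileno()).st_ino)
            self._open[key] = [handle, 0]
        entry = self._open[key]
        entry[1] += 1
        return entry[0]

    def release(self, handle):
        key = (os.getpid(), os.fstat(handle.fileno()).st_ino)
        entry = self._open[key]
        entry[1] -= 1
        if entry[1] == 0:
            # Only the last release closes the file; closing earlier would
            # drop every fcntl lock this process holds on the inode.
            del self._open[key]
            handle.close()


# Usage: two acquisitions of the same path share one handle.
with tempfile.TemporaryDirectory() as tmp:
    tracker = OpenFileTrackerSketch()
    lock_path = os.path.join(tmp, "lockfile")
    first = tracker.acquire(lock_path)
    second = tracker.acquire(lock_path)
    same_handle = first is second
    tracker.release(first)
    open_after_one_release = not first.closed
    tracker.release(second)
    closed_after_last_release = first.closed
```

Keying on the pid as well as the inode means a forked child starts with an empty view and opens its own handles, which matches property 1 in the commit message.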
2021-07-16API Docs: fix broken reference targetsAdam J. Stewart1-8/+18
2021-07-08imports: sort imports everywhere in Spack (#24695)Todd Gamblin1-4/+4
* fix remaining flake8 errors * imports: sort imports everywhere in Spack We enabled import order checking in #23947, but fixing things manually drives people crazy. This used `spack style --fix --all` from #24071 to automatically sort everything in Spack so PR submitters won't have to deal with it. This should go in after #24071, as it assumes we're using `isort`, not `flake8-import-order` to order things. `isort` seems to be more flexible and allows `llnl` imports to be in their own group before `spack` ones, so this seems like a good switch.
2021-04-15Use `gethostname()` instead of `getfqdn()` for lock debug modevsoch1-1/+1
In debug mode, processes taking an exclusive lock write out their node name to the lock file. We were using `getfqdn()` for this, but it seems to produce inconsistent results when used from within some github actions containers. We get this error because getfqdn() seems to return a short name in one place and a fully qualified name in another: ``` File "/home/runner/work/spack/spack/lib/spack/spack/test/llnl/util/lock.py", line 1211, in p1 assert lock.host == self.host AssertionError: assert 'fv-az290-764....cloudapp.net' == 'fv-az290-764' - fv-az290-764.internal.cloudapp.net + fv-az290-764 !!!!!!!!!!!!!!!!!!!! Interrupted: stopping after 1 failures !!!!!!!!!!!!!!!!!!!! == 1 failed, 2547 passed, 7 skipped, 22 xfailed, 2 xpassed in 1238.67 seconds == ``` This seems to stem from https://bugs.python.org/issue5004. We don't really need to get a fully qualified hostname for debugging, so use `gethostname()` because its results are more consistent. This seems to fix the issue. Signed-off-by: vsoch <vsoch@users.noreply.github.com>
2021-01-02copyrights: update all files with license headers for 2021Todd Gamblin1-1/+1
- [x] add `concretize.lp`, `spack.yaml`, etc. to licensed files - [x] update all licensed files to say 2013-2021 using `spack license update-copyright-year` - [x] appease mypy with some additions to package.py that needed for oneapi.py
2020-07-23Reduce output verbosity with debug levels (#17546)Tamara Dahlgren1-36/+26
* switch from bool to int debug levels * Added debug options and changed lock logging to use more detailed values * Limit installer and timestamp PIDs to standard debug output * Reduced verbosity of fetch/stage/install output, changing most to debug level 1 * Combine lock log methods; change build process install to debug * Changed binary cache install messages to extraction messages
2020-02-19Distributed builds (#13100)Tamara Dahlgren1-43/+218
Fixes #9394 Closes #13217. ## Background Spack provides the ability to enable/disable parallel builds through two options: package `parallel` and configuration `build_jobs`. This PR changes the algorithm to allow multiple, simultaneous processes to coordinate the installation of the same spec (and specs with overlapping dependencies.). The `parallel` (boolean) property sets the default for its package though the value can be overridden in the `install` method. Spack's current parallel builds are limited to build tools supporting `jobs` arguments (e.g., `Makefiles`). The number of jobs actually used is calculated as`min(config:build_jobs, # cores, 16)`, which can be overridden in the package or on the command line (i.e., `spack install -j <# jobs>`). This PR adds support for distributed (single- and multi-node) parallel builds. The goals of this work include improving the efficiency of installing packages with many dependencies and reducing the repetition associated with concurrent installations of (dependency) packages. ## Approach ### File System Locks Coordination between concurrent installs of overlapping packages to a Spack instance is accomplished through bottom-up dependency DAG processing and file system locks. The runs can be a combination of interactive and batch processes affecting the same file system. Exclusive prefix locks are required to install a package while shared prefix locks are required to check if the package is installed. Failures are communicated through a separate exclusive prefix failure lock, for concurrent processes, combined with a persistent store, for separate, related build processes. The resulting file contains the failing spec to facilitate manual debugging. ### Priority Queue Management of dependency builds changed from reliance on recursion to use of a priority queue where the priority of a spec is based on the number of its remaining uninstalled dependencies. 
Using a queue required a change to dependency build exception handling with the most visible issue being that the `install` method *must* install something in the prefix. Consequently, packages can no longer get away with an install method consisting of `pass`, for example. ## Caveats - This still only parallelizes a single-rooted build. Multi-rooted installs (e.g., for environments) are TBD in a future PR. Tasks: - [x] Adjust package lock timeout to correspond to value used in the demo - [x] Adjust database lock timeout to reduce contention on startup of concurrent `spack install <spec>` calls - [x] Replace (test) package's `install: pass` methods with file creation since post-install `sanity_check_prefix` will otherwise error out with `Install failed .. Nothing was installed!` - [x] Resolve remaining existing test failures - [x] Respond to alalazo's initial feedback - [x] Remove `bin/demo-locks.py` - [x] Add new tests to address new coverage issues - [x] Replace built-in package's `def install(..): pass` to "install" something (i.e., only `apple-libunwind`) - [x] Increase code coverage
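The priority-queue approach described above can be illustrated with a small sketch. This is not Spack's installer: `install_order` is a hypothetical function, packages are plain strings, and every dependency is assumed to appear as a key in the input mapping. The priority of a package is the number of its dependencies that are not yet installed, so only packages with priority 0 are ever installed.

```python
import heapq
from typing import Dict, List, Set


def install_order(deps: Dict[str, Set[str]]) -> List[str]:
    """Return an install order where dependencies always come first."""
    remaining = {pkg: set(ds) for pkg, ds in deps.items()}
    dependents: Dict[str, Set[str]] = {pkg: set() for pkg in deps}
    for pkg, ds in deps.items():
        for dep in ds:
            dependents[dep].add(pkg)

    # Heap entries are (number of uninstalled dependencies, package name).
    heap = [(len(ds), pkg) for pkg, ds in remaining.items()]
    heapq.heapify(heap)

    installed: List[str] = []
    done: Set[str] = set()
    while heap:
        priority, pkg = heapq.heappop(heap)
        if pkg in done or priority != len(remaining[pkg]):
            continue  # stale entry: a fresher one was pushed later
        if priority != 0:
            raise ValueError(f"dependency cycle involving {pkg}")
        installed.append(pkg)
        done.add(pkg)
        # Installing pkg lowers the priority of everything that needs it.
        for parent in dependents[pkg]:
            remaining[parent].discard(pkg)
            heapq.heappush(heap, (len(remaining[parent]), parent))
    return installed
```

For example, `install_order({"app": {"libA", "libB"}, "libA": {"zlib"}, "libB": {"zlib"}, "zlib": set()})` installs `zlib` first and `app` last. Re-pushing an entry instead of updating it in place is a common way to get a "decrease priority" operation out of `heapq`, at the cost of skipping stale entries on pop.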
2019-12-30copyright: update copyright dates for 2020 (#14328)Todd Gamblin1-1/+1
2019-12-23lock transactions: avoid redundant reading in write transactionsTodd Gamblin1-1/+6
Our `LockTransaction` class was reading overly aggressively. In cases like this: ``` 1 with spack.store.db.read_transaction(): 2 with spack.store.db.write_transaction(): 3 ... ``` The `ReadTransaction` on line 1 would read in the DB, but the WriteTransaction on line 2 would read in the DB *again*, even though we had a read lock the whole time. `WriteTransaction`s were only considering nested writes to decide when to read, but they didn't know when we already had a read lock. - [x] `Lock.acquire_write()` return `False` in cases where we already had a read lock.
2019-12-23lock transactions: ensure that nested write transactions writeTodd Gamblin1-1/+7
If a write transaction was nested inside a read transaction, it would not write properly on release, e.g., in a sequence like this, inside our `LockTransaction` class: ``` 1 with spack.store.db.read_transaction(): 2 with spack.store.db.write_transaction(): 3 ... 4 with spack.store.db.read_transaction(): ... ``` The WriteTransaction on line 2 had no way of knowing that its `__exit__()` call was the last *write* in the nesting, and it would skip calling its write function. The `__exit__()` call of the `ReadTransaction` on line 1 wouldn't know how to write, and the file would never be written. The DB would be correct in memory, but the `ReadTransaction` on line 4 would re-read the whole DB assuming that other processes may have modified it. Since the DB was never written, we got stale data. - [x] Make `Lock.release_write()` return `True` whenever we release the *last write* in a nest.
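The nesting rule above can be sketched with plain counters. This is a simplified model under stated assumptions, not the real `Lock` class: `NestedLockSketch` is a hypothetical name, no file locking is performed, and the `release_fn` argument stands in for the transaction's write function. It shows only that the write callback runs when the *last write* in a nest is released, even if a read is still held around it.

```python
class NestedLockSketch:
    """Counter-only model of nested read/write lock acquisitions."""

    def __init__(self):
        self.reads = 0
        self.writes = 0

    def acquire_read(self) -> None:
        self.reads += 1

    def acquire_write(self) -> None:
        self.writes += 1

    def release_read(self) -> None:
        if self.reads == 0:
            raise RuntimeError("no read lock to release")
        self.reads -= 1

    def release_write(self, release_fn=None) -> bool:
        """Release one write; return True if it was the last write held."""
        if self.writes == 0:
            raise RuntimeError("no write lock to release")
        last_write = self.writes == 1
        if last_write and release_fn is not None:
            # Run the write-back while the lock is still counted as held.
            release_fn()
        self.writes -= 1
        return last_write
```

With a read held and two nested writes, the first `release_write` returns `False` and skips the callback, while the second returns `True` and triggers it, which is the case the commit message says was previously missed.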
2019-12-23lock transactions: fix non-transactional writesTodd Gamblin1-32/+68
Lock transactions were actually writing *after* the lock was released. The code was looking at the result of `release_write()` before writing, then writing based on whether the lock was released. This is pretty obviously wrong. - [x] Refactor `Lock` so that a release function can be passed to the `Lock` and called *only* when a lock is really released. - [x] Refactor `LockTransaction` classes to use the release function instead of checking the return value of `release_read()` / `release_write()`
2019-01-01copyright: update license headers for 2013-2019 copyright.Todd Gamblin1-1/+1
2018-10-17relicense: replace LGPL headers with Apache-2.0/MIT SPDX headersTodd Gamblin1-23/+4
- remove the old LGPL license headers from all files in Spack - add SPDX headers to all files - core and most packages are (Apache-2.0 OR MIT) - a very small number of remaining packages are LGPL-2.1-only
2018-09-25Increase and customize lock timeouts (#9219)Peter Scheibel1-71/+133
Fixes #9166 This is intended to reduce errors related to lock timeouts by making the following changes: * Improves error reporting when acquiring a lock fails (addressing #9166) - there is no longer an attempt to release the lock if an acquire fails * By default locks taken on individual packages no longer have a timeout. This allows multiple spack instances to install overlapping dependency DAGs. For debugging purposes, a timeout can be added by setting 'package_lock_timeout' in config.yaml * Reduces the polling frequency when trying to acquire a lock, to reduce impact in the case where NFS is overtaxed. A simple adaptive strategy is implemented, which starts with a polling interval of .1 seconds and quickly increases to .5 seconds (originally it would poll up to 10^5 times per second). A test is added to check the polling interval generation logic. * The timeout for Spack's whole-database lock (e.g. for managing information about installed packages) is increased from 60s to 120s * Users can configure the whole-database lock timeout using the 'db_lock_timeout' setting in config.yaml Generally, Spack locks (those created using spack.llnl.util.lock.Lock) now have no timeout by default This does not address implementations of NFS that do not support file locking, or detect cases where services that may be required (nfslock/statd) aren't running. Users may want to be able to more-aggressively release locks when they know they are the only one using their Spack instance, and they encounter lock errors after a crash (e.g. a remote terminal disconnect mentioned in #8915).
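The adaptive polling strategy mentioned above might look something like the following generator. Only the starting interval (.1 s) and the final interval (.5 s) come from the commit message; the intermediate .2 s step, the attempt thresholds, and the name `poll_intervals` are assumptions made for illustration.

```python
import itertools
from typing import Iterator


def poll_intervals(
    start: float = 0.1,
    middle: float = 0.2,      # assumed intermediate step
    cap: float = 0.5,
    slow_after: int = 20,     # assumed threshold, in attempts
    slowest_after: int = 60,  # assumed threshold, in attempts
) -> Iterator[float]:
    """Yield sleep times between lock attempts, backing off in stages."""
    for attempt in itertools.count():
        if attempt >= slowest_after:
            yield cap
        elif attempt >= slow_after:
            yield middle
        else:
            yield start
```

A caller would sleep for `next(intervals)` after each failed non-blocking lock attempt, so a contended lock on an overloaded NFS server is polled at most a few times per second instead of in a tight loop.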
2018-07-21locks: fix bug when creating lockfiles in the current directory.Todd Gamblin1-0/+5
- Fixes a bug in `llnl.util.lock` - Locks in the current directory would fail because the parent directory was the empty string. - Fix this and return '.' for the parent of locks in the current directory.
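The empty-parent case is easy to reproduce with `os.path.dirname`, which returns `''` for a bare file name. A minimal sketch of the fix, assuming a hypothetical helper name `lock_parent_dir`:

```python
import os


def lock_parent_dir(lock_path: str) -> str:
    """Return the directory that contains lock_path, never an empty string."""
    # os.path.dirname("db.lock") is "", which cannot be passed to
    # functions such as os.makedirs or os.access; use "." instead.
    return os.path.dirname(lock_path) or "."
```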
2018-07-12locks: improve errors and permission checkingTodd Gamblin1-29/+58
- Clean up error messages for when a lock can't be created, or when an exclusive (write) lock can't be taken on a file. - Add a number of subclasses of LockError to distinguish timeouts from permission issues. - Add an explicit check to prevent the user from taking a write lock on a read-only file. - We had a check for this for when we try to *upgrade* a lock on an RO file, but not for an initial write lock attempt. - Add more tests for different lock permission scenarios.
2018-07-12locks: llnl.util.lock now only writes host info when in debug modeTodd Gamblin1-9/+18
- write locks previously wrote information about the lock holder (host and pid), and read locks would read this in. - This is really only for debugging, so only enable it then - add some tests that target debug info, and improve multiproc lock test output
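A rough sketch of the debug-only behavior described above follows. The function name `write_holder_info` and the `pid=...,host=...` text format are assumptions, not taken from the source; the seek, write, truncate, flush sequence is there so that a shorter record fully replaces a longer stale one.

```python
import io
import os
import socket


def write_holder_info(handle, debug: bool) -> None:
    """Record the lock holder in the lock file, but only in debug mode."""
    if not debug:
        return
    handle.seek(0)
    handle.write(f"pid={os.getpid()},host={socket.gethostname()}")
    handle.truncate()  # drop leftovers from a previous, longer record
    handle.flush()


# Usage with an in-memory file standing in for the lock file.
lock_file = io.StringIO("x" * 500)
write_holder_info(lock_file, debug=False)
untouched = lock_file.getvalue() == "x" * 500
write_holder_info(lock_file, debug=True)
recorded = lock_file.getvalue()
```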
2018-05-18locks: add configuration and command-line options to enable/disable locks (#7692)Todd Gamblin1-11/+19
- spack.util.lock behaves the same as llnl.util.lock, but Lock._lock and Lock._unlock do nothing. - can be disabled with a control variable. - configuration options can enable/disable locking: - `locks` option in spack configuration controls whether Spack will use filesystem locks or not. - `-l` and `-L` command-line options can force-disable or force-enable locking. - Spack will check for group- and world-writability before disabling locks, and it will not allow a group- or world-writable instance to have locks disabled. - update documentation
2018-03-24Update copyright on LLNL files for 2018. (#7592)Todd Gamblin1-1/+1
2017-11-04Replace github.com/llnl/spack with github.com/spack/spack (#6142)Todd Gamblin1-1/+1
We moved to a new GitHub org! Now make the code and docs reflect that.
2017-09-06Update copyright notices for 2017 (#5295)Michael Kuhn1-1/+1
2017-07-04Parametrized lock test and make it work with MPITodd Gamblin1-2/+7
- Lock test can be run either as a node-local test or as an MPI test. - Lock test is now parametrized by filesystem, so you can test the locking capabilities of your NFS, Lustre, or GPFS filesystem. See docs for details.
2017-06-24Make LICENSE recognizable by GitHub. (#4598)Todd Gamblin1-1/+1
2017-04-25Add API Docs for lib/spack/llnl (#3982)Adam J. Stewart1-10/+10
* Add API Docs for lib/spack/llnl * Clean up after previous builds * Better fix for purging API docs
2016-10-11Fix bug with lock upgrades.Todd Gamblin1-12/+16
- Closing and re-opening to upgrade to write will lose all existing read locks on this process. - If we didn't allow ranges, sleeping until no reads would work. - With ranges, we may never be able to take some legal write locks without invalidating all reads. e.g., if a write lock has distinct range from all reads, it should just work, but we'd have to close the file, reopen, and re-take reads. - It's easier to just check whether the file is writable in the first place and open for writing from the start. - Lock now only opens files read-only if we *can't* write them.
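The last two points above amount to choosing the open mode once, up front, instead of reopening the file to upgrade. A minimal sketch, assuming the lock file already exists and using a hypothetical helper name `lock_file_mode`:

```python
import os
import tempfile


def lock_file_mode(path: str) -> str:
    """Open read-write whenever possible so a later upgrade needs no reopen."""
    # Closing and reopening would drop this process's fcntl locks on the
    # file, so read-only mode is used only when writing is impossible.
    return "r+" if os.access(path, os.W_OK) else "r"


# Usage: a freshly created temporary file is writable by its owner.
with tempfile.NamedTemporaryFile() as tmp:
    mode_for_writable_file = lock_file_mode(tmp.name)
```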
2016-10-11Add byte-range parameters to llnl.util.lockTodd Gamblin1-19/+40
2016-10-11Remove need to touch lock files before using.Todd Gamblin1-12/+60
- Locks will now create enclosing directories and touch the lock file automatically.
2016-10-11Make llnl.util.lock use file objects instead of low-level OS fds.Todd Gamblin1-17/+19
- Make sure we write, truncate, flush when setting PID and owning host in the file.
2016-10-11install : finer grained locking for install commandalalazo1-0/+8
2016-10-04Read-only locks should close fd before opening for write. (#1906)Todd Gamblin1-0/+8
- Fixes bad file descriptor error in lock acquire, #1904 - Fix bug introduced in previous PR #1857 - Backported fix from soon-to-be merged fine-grained DB locking branch.
2016-09-30Fix read locks on read-only file systems (#1857)Michael Kuhn1-1/+2
2016-08-10Make Spack core PEP8 compliant.Todd Gamblin1-0/+3
2016-08-09Properly re-raise exceptions from lock context handler.Todd Gamblin1-4/+6
2016-08-09Flake8 fixesTodd Gamblin1-8/+10