Increase robustness of ProcessManager.terminate() in the face of zombies.
Review Request #2513 - Created July 22, 2015 and submitted
|areitz, benjyw, jsirois, nhoward_tw, stuhood, zundel|
- Implement zombie detection in ProcessManager.is_alive() - Implement stale pid detection in ProcessManager.is_alive() and enforce for NailgunExecutor - Misc refactoring + improved usage of psutil.Process attributes in support of the above - Tests
https://github.com/pantsbuild/pants/pull/1848 + setup a centos6 vm, created a zombie process via a test harness and attempted to kill it via the ng-killall code path (which relies on ProcessManager.terminate() & is_alive()). verified this change successfully prevents the exception case on linux when trying to kill a zombie'd process.
I knew this, but reading up on zombie support in psutil it drove home exactly how far behind we are. The maintainer blogs about zombie handling being broken and fixed recently here: http://grodola.blogspot.com/ I'm fine with and even prefer a seperate RB, but I think its time to upgrade. Its true that this is a native lib so pants users who build corss-platform pexes for pants will have extra work to do to upgrade, but it would be good to get past the zombie bugs and onto a supported year.
- Safeguard against a race from pid_exists() -> Process() by relying on pid checking in Process.__init__(). - Drop pid arg to is_alive() - Test fixes
Revision 2 (+81 -27)