Multiprocessing Cache Check and Write

Review Request #1265 - Created Oct. 31, 2014 and submitted

Information
David Taylor
pants
754
1267
2e63884...
Reviewers
pants-reviews
benjyw, johanoskarsson, patricklaw, zundel

This is a slight refactor of a change we've been using internally at Foursquare for a few months.

Avoiding recompression during downloads (https://rbcommons.com/s/twitter/r/1233/) reduces the amount of cache work we do; this change is about doing the remaining work faster. Most cache work is CPU-bound gzipping of the tarballs, so moving cache fetch/decompression and compression/upload into subprocesses improves multicore utilization.

This is done using a multiprocessing.Pool, passing ArtifactCaches and the arguments to their methods to subprocesses (which is why they were made pickle-friendly in previous changes).
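As a rough illustration of the pattern described above (not the code in this change), the sketch below sends a picklable cache object plus the arguments for one of its methods to a pool worker. ToyArtifactCache, call_insert, and insert_all are hypothetical names invented for this example, and the explicit "fork" start method is an assumption that limits it to POSIX systems.

```python
import gzip
import multiprocessing


class ToyArtifactCache(object):
    """Hypothetical stand-in for an ArtifactCache; it holds only picklable state."""

    def __init__(self, compression_level):
        self.compression_level = compression_level

    def insert(self, key, payload):
        # CPU-bound step: gzip the payload, much as a cache would gzip a tarball.
        return key, gzip.compress(payload, compresslevel=self.compression_level)


def call_insert(cache, key, payload):
    # Module-level function, so it can be pickled by reference and run in a worker.
    return cache.insert(key, payload)


def insert_all(cache, items, processes=2):
    # Assumption: POSIX-only "fork" start method, chosen to keep the sketch simple.
    ctx = multiprocessing.get_context("fork")
    pool = ctx.Pool(processes=processes)
    try:
        # The cache object and its arguments are pickled and shipped to the workers.
        pending = [pool.apply_async(call_insert, (cache, key, payload))
                   for key, payload in items]
        return dict(result.get() for result in pending)
    finally:
        pool.close()
        pool.join()
```

Because the cache instance travels with every task, anything it holds must survive pickling, which is the reason the description gives for the earlier pickle-friendliness changes.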

Additionally, we discovered (through a very painful debugging process) that Python's zlib bindings (in 2.x) use a global static lock. If any thread holds that lock when a subprocess is created via fork(), the child can inherit it in a locked state and will deadlock if it later tries to use zlib.

This is why the pools are created early in run_tracker rather than as needed; the comment on SubprocPool in worker_pool.py discusses this in more detail.
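The ordering constraint can be sketched roughly as follows: fork the workers first, and only afterwards let threads in the parent call into zlib. This is a minimal illustration of the ordering, not the SubprocPool implementation; compress_with_early_pool is a hypothetical name, and the "fork" start method is assumed (POSIX only).

```python
import multiprocessing
import threading
import zlib


def compress_with_early_pool(payloads, processes=2):
    # Assumption: POSIX-only "fork" start method, to mirror the fork() case above.
    ctx = multiprocessing.get_context("fork")
    # Step 1: fork the workers up front, before this function starts any
    # thread that calls into zlib.
    pool = ctx.Pool(processes=processes)
    try:
        # Step 2: only now start a parent-side thread that uses zlib.
        background = threading.Thread(target=zlib.compress,
                                      args=(b"x" * 1000000,))
        background.start()
        # Step 3: hand work to the workers that were forked in step 1.
        compressed = pool.map(zlib.compress, payloads)
        background.join()
        return compressed
    finally:
        pool.close()
        pool.join()
```

Creating the pool lazily, at step 3 instead of step 1, would fork while the background thread might be inside zlib, which is the situation the description warns about.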

./pants goal test tests/python/pants_test/cache::
https://travis-ci.org/pantsbuild/pants/builds/39635717

Tested the debug message with https://gist.github.com/dt/a5d72f580b3fb9a60466, but I am not checking that test in: wall-clock sleeps in automated tests don't seem worth it just to cover a log message.

Review request changed

Status: Closed (submitted)

Change Summary:

f730bc82ae1f326dcc0fd28ea94771388a3bf46c