Support caching chroots for reuse across pants runs.

Review Request #2349 — Created June 12, 2015 and submitted

benjyw
pants
1669
cf8f326...
pants-reviews
jsirois, patricklaw
- Creates huge speedups e.g., when running tests.

- Not turned on for any code yet.  To work safely
  this requires a change to the pex library to support
  creating chroots via copying instead of hard-linking,
  and we're stil waiting on upgrading to a version of pex
  with that change. However this has undergone enough
  "unsafe" testing to verify that it works and provides
  peformance benefits, so I'm submitting it now, to prevent
  it from drifting.

- Incidental changes that this commit required:
  * PythonChroot no longer deletes itself when GC'd. It's generally
    a bad idea to rely on cleanup in __del__ anyway, as there's no
    guarantee it'll ever be called.
  * Simplified some of the interface to PythonChroot. For example,
    you no longer specify an executable name - the chroot creation
    code hard-codes one for you.  The only caller we had for this
    was hard-coding a name anyway, so no real loss of functionality
    there.
  * Modified the backtrace munging trick in python_eval to rely only
    on the parent of the chroot dir, not the chroot dir itself, as
    that is no longer known when we generate the eval's entry point: we need
    to generate that entry point so we can hash it and use that hash
    as input to the function that generates well-known chroot paths.

CI passes: https://travis-ci.org/pantsbuild/pants/builds/66601000.

JS
  1. 
      
  2. This is a general problem we have with the artifact cache.  It would be nice if this custom caching were replaced with an upstream task that produced PythonChroot products instead.  That will required the new engine though where downstream doesn't just ask for PythonChroot products, it asks for them in a qualified way saying one PythonChroot for this target, 1 for these 3 targets together, etc.
    
    Its perhaps worth a nod in the TODO that the real place to solve this is globally in the artifact_cache.
    1. Yeah, I spent about 10 minutes investigating if I could make this happen with products, but it was too big a hoop to jump through. Added a TODO though.

  3. Shouldn't this use PythonSetup.instance_for_task(self) to account for subclasses that might override PythonSetup options in their own scope?

    1. Possibly, but that's orthogonal to this change, and would need to be done carefully and in a few different places.

  4. In this case the contextmanager makes sense so it makes sense to use with temporary_dir() as path:

    1. This is as-is from the old code, and I assume the reason is that chroot.delete() kills the tmpdir anyway. I think it might be confusing to use a temporary_dir() context, because it wouldn't be clear why that was necessary, and also because changing the cleanup arg to False wouldn't work (because chroot.delete() would still run) and that would be potentially confusing.

      Added a comment to that effect though, to clarify.

  5. Instead of hacking this in, it seems like the right thing to do is call `self.reset_build_graph()` and then re-resolve the b_library - returning it to the caller.
    1. Ah, good idea. Done.

  6. 
      
BE
JS
  1. Ship It!
  2. 
      
BE
BE
Review request changed

Status: Closed (submitted)

Change Summary:

Submitted as f5933801a899f6d17e94fcba253fbe8ffbc98f1c.

BE
  1. Thanks John! Submitted as Thanks John!.

  2. 
      
Loading...