Clean up after failed artifact extractions

Review Request #4255 — Created Sept. 22, 2016 and submitted

stuhood
pants
3894, 3895
pants-reviews
patricklaw, peiyu, yujiec

In the case of a corrupted remote artifact, we rely on tarball checksums to determine whether an extraction attempt has failed. But because we don't see tarball extraction errors until after some of the contents of the tarball might have been extracted, we might have already polluted the workspace.

Luckily, for Tasks that set cache_target_dirs, the contents of the cache artifact should all be located under the results_dir, meaning that clearing the results_dir in case of failure to extract should purge partially extracted artifacts.

  • Clean up after failed artifact extractions which specify a results_dir.
  • Move initialization of results_dirs before cache lookups and validate that they are are set by accessing them via their property whenever cache_target_dirs is set.
  • Deprecate compression level ==0, which disables gzip compression entirely, and doesn't seem to do any checksumming (without compression, none of truncating/mutating/etc of the test file were detected)
  • Extract the pants_test.cache.cache_server helper to provide a "remote" cache server for integration testing
  • Add an end-to-end corruption test for java compilation

green ci: https://travis-ci.org/pantsbuild/pants/builds/165385610

ST
PE
  1. 
      
  2. can't tell how tarball extracts itself into results_dir

    i tried to see how results_dir is used by placing a conditional breakpoint results_dir is not None, suprisingly it never stopped, wonder this has become dead code. I ran only one target, but worth a check

    1. It's used here: https://github.com/pantsbuild/pants/blob/master/src/python/pants/task/task.py#L492-L496 and is called from a separate process, which is probably why your breakpoint didn't fire.

  3. 
      
ST
ST
PE
  1. 
      
  2. Looks results_dir is always None at this point.

    Because the only place called by task that sets _results_dir to some value is in

    self._maybe_create_results_dirs(invalidation_check.all_vts)
    

    which happens after the cache check

    self.check_artifact_cache(self.check_artifact_cache_for(invalidation_check))
    

    steped through zinc_compile phase and confirmed above.

    1. That was fixed in diff 2. Please take another look.

    2. ah, missed _maybe_create_results_dirs is placed earlier in new diff. Still have one question though, left a comment.

  3. 
      
PE
  1. 
      
  2. since results_dir is linked to stable name, in the case of corrupted artifact, we will erase the /current/? is this going to be recreated by compile?

    1. Good point. I've gone ahead and switched this from rmtree(..) to mkdir(.., clean=True).

  3. 
      
ST
PE
  1. 
      
  2. stable name is first created as a symlink, now this is going to change it to a directory (the way safe_mkdir is implemented), seems problematic?

  3. 
      
PE
  1. 
      
  2. not obvious since result_dir does not seem to be actually used, it happens to be the symlink that points to the extracted artifact directory, worth a precondition check or at least document

  3. 
      
ST
ST
YU
  1. Looks good! A few minor comments.

  2. Since it's just calling base class method, this should not be necessary?

  3. whitespace

  4. ditto.

  5. 
      
PE
  1. Thanks for the thorough tests!

  2. maybe a NB: results_dir is not supposed to be symlink, or just a precondition check

  3. src/python/pants/task/task.py (Diff revision 4)
     
     

    worth a comment on using current_results_dir as opposed to the stable dir?

  4. remove trailing ws

  5. remove trailing ws

  6. 
      
ST
ST
Review request changed

Status: Closed (submitted)

Change Summary:

Merged as ce5f34706e4791bfe32c94de82f1233360e3b110

Loading...