[engine] Add Runnable State

Review Request #4158 — Created Aug. 13, 2016 and submitted

stuhood
pants
3783
pants-reviews
jsirois, kwlzn, nhoward_tw, patricklaw, peiyu

This review adds the Runnable State, which a Node returns when it is ready to run a task, process, or filesystem operation. The Engine now deals only with the pickleable top level functions and arguments represented by Runnables. As expected, this significantly simplifies caching, and makes profiles much more readable (because rather than occuring deep inside the Scheduler, task functions are called from shallow locations in the Engine). Thanks to Nick and Timur, who both suggested that this would be clearer.

Profiling after this change indicates that about 80% of execution time is spent in scheduling, while the remaining 20% happens in the top-level task and filesystem functions. The separation it introduces should help in significantly reducing the scheduling overhead.

  • Add a TargetAdaptor for RemoteSources to make it pickleable.
  • Make Engine.start() implicit, since it was being forgotten in a bunch of places anyway.
  • Drop the scheduler.{Promise,StepRequest,StepResult} classes and convert Scheduler.schedule to a coroutine that yields Runnables and expects the Engine to pass back results.
  • Set Process.daemon=True and re-enable all Engine tests (perhaps optimistically?)
  • Split out most of the body of ProcessExecutionNode into a top-level function for pickleability.
  • Drop the type and string from Key; may eventually reintroduce, but currently unused and a bit awkward.
  • Replace usage of StepResult and StepRequest in Storage with support for storing State instances: a Runnable state goes in, and Return or Throw come out.

https://travis-ci.org/pantsbuild/pants/builds/155526614

  1. lgtm! few misc nits/comments.

  2. src/python/pants/engine/engine.py (Diff revision 1)
     
     
     
     
     
     
     
     
     
     

    missing doc for :param runnable:

  3. src/python/pants/engine/engine.py (Diff revision 1)
     
     
     

    reuse the already retrieved temp var here?

    ... if is_cacheable else None
    
  4. src/python/pants/engine/isolated_process.py (Diff revision 1)
     
     
     
     
     

    nit: these params seem to have gotten misaligned indent-wise during the move

  5. realize you're just porting, but this should probably get a more distinguishable exception type vs Exception - SnapshotExecutionFailure or NonZeroExit etc.

    you could also move the exception string construction into the exception type's __init__ as well so you could just do something like:

    return Throw(NonZeroExit(process.binary, process_result.exit_code))
    

    prior art: https://github.com/pantsbuild/pex/blob/master/pex/executor.py#L13-L44

  6. src/python/pants/engine/scheduler.py (Diff revision 1)
     
     

    nit: use a tuple vs list here?

  7. src/python/pants/engine/scheduler.py (Diff revision 1)
     
     
     

    is this bare yield intentional? if so, maybe a comment describing its purpose?

  8. 
      
  1. 
      
  2. 
      
Review request changed

Status: Closed (submitted)

Change Summary:

Merged as 14759e1065e307c7760299670da30f64ffdbf783

Loading...