[engine] Optionally inline inlineable Nodes

Review Request #3931 — Created May 25, 2016 and submitted

jsirois, kwlzn, nhoward_tw, wisechengyi

This review adds support for inlining execution of Nodes, which skips storing them in the ProductGraph between execution, or involving them in the scheduling loop. The attached viz shows that only TaskNode and FilesystemNode (and a root DependenciesNode) are still stored in the ProductGraph... other nodes have been inlined.

It has fairly nice implications for performance... with caching disabled, cold ./pants list :: with the daemon is down to ~8 seconds on my machine, which puts it at around 5x of cold without the daemon. It also makes the remaining hotspots in the profiles significantly more obvious.

  • Remove the step(.., dependencies) argument by hiding it behind StepContext.get.
  • Add Node.is_inlineable, to indicate that a Node runs cheaply enough to not be worth memoizing in the ProductGraph.
  • Optionally inline inlineable dependencies in StepContext.
  • Move the Selector->Node factory to StepContext, which eliminates some unnecesary cross-module imports.
  • Switch legacy.graph to consuming the addresses of the root LegacyTarget nodes, and back to maybe_list (yep... sorry Yi).
  • Add ProductGraph.trace, and use it to log stacktraces.
  • Lazily format Noop messages, since they are rarely used.
  • Render cyclic/rejected dependencies as dotted lines in ProductGraph.visualize.


Loading file attachments...

  1. nice, lgtm!

  2. src/python/pants/engine/nodes.py (Diff revision 1)

    should these sort of type comparisons be an explicit is vs == throughout?

  3. src/python/pants/engine/nodes.py (Diff revision 1)

    maybe consolidate like the in check in the diff immediately below?

    if type(input_state) in (Throw, Waiting):
      return input_state
  4. src/python/pants/engine/nodes.py (Diff revision 1)

    should the check for Waiting join the below type(...) in [Throw, Return<, Waiting>] check since it's all the same behavior?

  5. src/python/pants/engine/nodes.py (Diff revision 1)

    might be a good time to add a quick docstring to TaskNode?

  6. src/python/pants/engine/scheduler.py (Diff revision 1)

    :param inline_nodes:

  7. src/python/pants/engine/storage.py (Diff revision 1)

    for readability, this and the similar call below should either consume the full horizontal width w/ aligned continuations:

    return StepRequest(a, b, c, [... -> col 100],

    or get broken out one param per line like:

    return StepRequest(a,
  2. src/python/pants/engine/nodes.py (Diff revision 1)

    The abc trick works here but I'm not a fan of lying about this being an instance property via its form. The docs are right, the impls are right (class fields), subclasses do blow up when the "abstractproperty" is undefined... so I have no wind to carry the objection. This is just a registration of discomfort...

    1. Is there a recommended alternative to accomplish this?

  3. src/python/pants/engine/nodes.py (Diff revision 1)
    You lost the full set of conflicting producers in this refactor, was that intentional?
    1. It was a premature optimization probably... the nesting is unpleasant. Will restore.

Review request changed

Status: Closed (submitted)

Change Summary:

Merged as 5cf0104a73756d8b5e3fa33a77cf3d64926976eb