[engine] Introduce static analysis model and replace validator with it

Review Request #4251 — Created Sept. 28, 2016 and submitted

nhoward_tw
pants
3890
pants-reviews
kwlzn, stuhood, yujiec

Introduces a Rule Graph that only contains the rules that cannot result in a Noop due to missing selector values. Replaces the existing validator with one backed by this new model. This means that for instance, intrinsics, or product graphs that depend on intrinsics, but are ultimately Nooped can be eliminated early.

There's still a number of TODOs left in the patch, but it's to a point where I feel like I need feedback before making further changes.

Improvements
* better error reporting over the previous implementation: The validator will now report transitive failures. Subject type match failures are also reported

Before

Found 1 rules with errors:
  (A, (Select(B),), noop)
    There is no producer of Select(B)

After

Rules with errors: 1
  (A, (Select(B),), noop):
    no matches for Select(B) with subject types: SubA

Doesn't cover
* cycles: I've got some ideas, but this has gotten pretty big already. Cycles won't cause non-termination because the graph generation algorithm only visits each rule/subject-type entry once.
* noops caused by tasks that can return None
* actually generating nodes base on the graphs
* variants
* HasProducts

Follow on work:
* performance. There are improvements to the model generation that could be done. But, I want to split that work out from this initial cut.
* integration into the scheduler. After this lands, I want to adapt the scheduler to use it for pre-filling dependencies of nodes to execute, and ultimately to simplify or eliminate SelectNodes.

Introducing and running tests for the graph and validator locally. CI passed at https://travis-ci.org/pantsbuild/pants/builds/165056642

  • 0
  • 0
  • 1
  • 0
  • 1
Description From Last Updated
NH
ST
  1. This looks great, and I can begin to see how it would replace most of the Node bodies. As you said, there is more to do to figure out how to handle portions that are conditional on runtime values rather than types.

    In general, I'd love to see a few more comments. But am shipping now because this is exceedingly well tested, and will certainly be changing further.

  2. src/python/pants/engine/rules.py (Diff revision 1)
     
     
     
     
     
     

    At a fundamental level, it's not clear what the "node" and "edge" types in this graph are (Rules and Selectors, I guess? Or is it bipartite between Rule and Selector nodes, with unlabeled dependency edges?), it would be good to expand a docstring out to explain that if possible.

    1. That makes sense.

    2. +1 on Stu's suggestion.

  3. src/python/pants/engine/rules.py (Diff revision 1)
     
     
     

    Seems like natively supporting "subject_type==None means 'any type'" in Diagnostic would be cleaner, as you could render a more useful error message for that case.

  4. src/python/pants/engine/rules.py (Diff revision 1)
     
     
     

    Comment on why ("is satifiable by definition"?).

    1. Marker classes talked about below would cover these.

  5. src/python/pants/engine/rules.py (Diff revision 1)
     
     
     

    mismatched.

    Worth adding a marker superclass?

    ps: going to love having real ADTs =D

    1. I think that could make sense. I think I could make it work with two marker classes. RuleGraphEntry can be on either side of an edge. RootRule is left hand side only. The remaining types are right hand side only. I'll think up some better names than left hand / right hand.

  6. src/python/pants/engine/rules.py (Diff revision 1)
     
     
     

    unclear

    1. Ah, I haven't added a case for that yet, and I think it's still a bug. Not using sets for edge collections.

  7. src/python/pants/engine/rules.py (Diff revision 1)
     
     

    A reminder: happy to temporarily drop support for producing the explicitly-requested Variants class midgraph if that makes this easier. I think sealing the graph as you're planning is significantly more important at the moment, and we can revisit to find the right way to do this in the future.

    1. I could whip up a review that removes variants since we're not using them yet. It'll be a significant simplification. I think the planner tests that depend on variants are going to cause problems when integrating the graph into the scheduler.

  8. src/python/pants/engine/rules.py (Diff revision 1)
     
     

    Would be good to move anything like this TODO to Selector construction time, IMO.

    1. +1. Then we'd have a much better trace.

  9. 
      
YU
  1. Nick,
    Changes look awesome. I am still learning the Rule thing. So most of my comments are actually questions.

  2. src/python/pants/engine/rules.py (Diff revision 1)
     
     

    I am confused here. Why do we want to add constraint here, given that all types in the contraint is already added?

    1. The full constraint is also a key that's used. eg there are tasks that consume that union via a Select. I considered adding a spill section for constraints that can't be indexed, but since we're only doing exact matches at this time, it wasn't necessary.

  3. src/python/pants/engine/rules.py (Diff revision 1)
     
     
     
     
     
     

    For intrinsic product you filter with subject_types. But for task_product subject_type is not used for filtering.

    Is it possible that for a certain subject_type, a task cannot be executed thus the product of that task should not be in the return set?

    1. Yes. But the NodeBuilder doesn't have enough information to determine that. The graph does, but I haven't integrated that yet. I think one of the follow up clean up tasks will be to give things better names, which should make that clearer.

  4. src/python/pants/engine/rules.py (Diff revision 1)
     
     

    So the check you did in __repr__ implies that self.value may not be a type instance, in that case, self.value.__name__ may result in AttributeError.

    1. Good point. The way things are now, I don't think that difference is necessary as the path where it wouldn't be a type is gone. I'll rm the conditional since it's confusing.

  5. src/python/pants/engine/rules.py (Diff revision 1)
     
     

    seems selector is unused.

    1. Right. That was a little premature. I wanted to use the selector to narrow the dependency removal checking, but I hadn't come up with a good test case that would break it.

      Since it's not used, I can rm it.

  6. src/python/pants/engine/rules.py (Diff revision 1)
     
     

    ditto

  7. src/python/pants/engine/rules.py (Diff revision 1)
     
     
     

    Maybe convert this to a doc string? also the comment is kinda confusing to me. Is it saying, if the input rule is unfulfillable then this RuleEdges becomes unfulfillable?

    1. ah, maybe I could rename to makes_unfulfillable(self, dep_to_remove)? Would that be clearer?

      The idea is that if rule is unfulfillable, then the owner of this set of edges also becomes unfulfillable.

    2. Ah I saw the new prototype. Actually I think the old one is clearer. I read the comment once more and now it makes more sense to me.
      Sorry about that!

      If the input parameter is an unfulfillable rule maybe just call it unfulfillable_rule instead of dep_to_remove?

  8. src/python/pants/engine/rules.py (Diff revision 1)
     
     

    Should this be moved to line 462?
    The line immediately following this comment is about handling the corner case which makes the comment kinda confusing.

    also, nit: "If"

    1. Ah, that's a hold over from an earlier iteration. I'll just rm. It's no longer necessary.

  9. 
      
NH
NH
YU
  1. Ship It!
  2. 
      
NH
NH
Review request changed

Status: Closed (submitted)

Change Summary:

Submitted as https://github.com/pantsbuild/pants/commit/9bb59594708105402856879e9a6970fbfadb0b41
Loading...