[engine] Pass selectors to select nodes; Use selectors in error messages

Review Request #4031 — Created July 18, 2016 and submitted

nhoward_tw
pants
3611
pants-reviews
jsirois, patricklaw, stuhood, zundel

Currently selectors are dropped when select nodes for task nodes are instantiated. This means that if a selector fails to match, the error message can't reproduce the selector.

This changes the select nodes so that they are passed the selector and can reuse it for producing error messages. It additionally overrides the __repr__ implementations for selectors so they'll be displayed similarly to how they were declared.

Currently a select failure will look like this:

Noop(msg=u"No source of SelectNode(subject=PathGlobs(dependencies=(PathDirWildcard(canonical_stat=Dir(path=u''), symbolic_path=u'', wildcard=u'fs_test', remainder=u'a/b/*'),)), product=<class 'pants_test.engine.test_isolated_process.Concatted'>, variants=None, variant_key=None).").

With the repr and message changes, it looks like this:

Noop(msg=u"No source of SelectNode(subject=PathGlobs(dependencies=(PathDirWildcard(canonical_stat=Dir(path=u''), symbolic_path=u'', wildcard=u'fs_test', remainder=u'a/b/*'),)), variants=None, selector=Select(Concatted)).")

CI away on PR.

  • 0
  • 0
  • 0
  • 2
  • 2
Description From Last Updated
ST
  1. It's not clear that displaying repr(Selector) is more readable for users than displaying repr(Node). Is there a specific case that you were trying to make more readable?

    More generally, I'm uncomfortable with the fact that there would now be a bunch of redundant data in the Node's identity, because any information in the Selector will already have been in the Node's identity, and is now in there in a second form.

    1. For point one, I think the root question is: Are nodes user facing, or are they internals? If they are user facing, then exposing repr(Node) is fine. But if we aren't expecting users to interact with nodes directly, displaying them in traces won't be helpful. I agree that my approach here was a little haphazard. I'd only touched the Noop's that I ran into while working on the process execution patch. Also, there aren't tests that exercise theses messages. How about I just revert the state messages for now?

      For your second point, I didn't remove the redundant information from SelectNode yet because I didn't want to decompose it as part of this step. I still think the patch as it is does clarify things. It makes it so that when you see a SelectNode in a trace, its selector will be visible.

      I did remove the redundant information from the datatype signature of the other nodes. I think I see a way to do that with SelectNode as well, so I'll take a crack at it.

    2. I think the root question is: Are nodes user facing, or are they internals?

      I don't think that either of Node or Selectors are supposed to be user facing. If there is a particular usecase you are trying to improve for the end user, it would likely be preferable to do something specific in context there, and then if we begin to notice a pattern we can make it the default output.

    3. The use case was me, trying to understand the failure traces I got through the fs ruleset when I wasn't familiar with it. I reverted the message. I think it could be better, but I think a better approach would be to pick some scenarios and go after them specifically.

  2. This doesn't seem to be testing anything useful... the output is created by namedtuple, so this is effectively a test of namedtuple's repr.

    1. It doesn't test namedtuple's repr, because I overrode __repr__ on all of the selector classes. namedtuples repr always includes the kwarg and doesn't elide optional arguments. Additionally, type arguments are presented using the default type representation, so they end up looking like Select(product=<class 'pants_test.engine.test_isolated_process.Concatted'>). The end result is that selectors in messages look nothing like their definitions and are much more verbose.

      The patch tries to make their repr look like the literal declaration, so the previous example would become Select(Concatted). I think it would also be reasonable to use FQNs for the fields containing types if that would be better.

  3. 
      
NH
ST
  1. Thanks, this looks much better.

  2. src/python/pants/engine/nodes.py (Diff revision 2)
     
     

    I think that one thing that continues to make me uncomfortable is that not all Nodes have Selectors (as implemented). How a FilesystemNode ended up selected is a hole in the improved debugging that this offers.

    One case that I think exists is that in some cases a Node can be instantiated from different Selectors (SelectLiteral/Select come to mind). So a Selector isn't (currently) a Node property so much as an "edge" property.

    Also, it seems like it might be possible to clearly distinguish between intrinsic Nodes (those without Selectors, I think?) and user declared Nodes, and that might clean up the Node subclasses in some way.

    1. I think it makes sense that not all nodes have selectors. Select nodes represent edges. They aren't cached and can be inlined. They don't have values in themselves. Instead, they propagate and combine values from other node types in specified ways. Task nodes, and intrinsics are different. They compute values from outside the graph.

      I see what you mean about Select vs SelectLiteral, but I'm not sure how often you'd see that in practice, because of the subject change in SelectLiteral.

      I do like the idea of having intrinsic nodes distinguished in debugging representations better, but I'd like to punt on that. I think a targeted diff would be easier to review. Filed an issue here. https://github.com/pantsbuild/pants/issues/3727

    2. Works for me.

  3. 
      
NH
Review request changed

Status: Closed (submitted)

Change Summary:

Submitted as https://github.com/pantsbuild/pants/commit/c08c8126402772840d66d392a2217a9371a363d0
Loading...