Task identity fingerprint strategy

Review Request #2496 - Created July 17, 2015 and submitted

Cody Gibb
1273, 1351
benjyw, jsirois, patricklaw, stuhood, zundel
  • Implemented a TaskIdentityFingerprintStrategy, which applies the current task’s identity (which is composed of task options) to target hashes, and makes it the new default fingerprint strategy. When registering their options, tasks select which options should be included in their identity.
  • Added two new Option types -- target_list and file, so the option can be correctly fingerprinted.
  • Added corresponding PayloadFields: FileField and TargetListField.
  • Small API change to Payload / PayloadField, which can now accept an optional Context, in order for TargetListField to resolve target specs and fingerprint the targets.
  • Removed JvmFingerprintStrategy -- relevant JavaCompile and ScalaCompile options are now directly labeled "fingerprint=True", and TaskIdentityFingerprintStrategy takes care of the rest.

CI went green here: https://travis-ci.org/pantsbuild/pants/builds/71494131

  • unit testing for options payload, new option types, and new payload fields
  • integration testing using checkstyle as example task -- ensured that changing configuration / jvm tools resulted in invalidating unchanging targets


  • 2
  • 11
  • 3
  • 16
Description From Last Updated
This is a big red flag to me. Targets that have TargetList payload fields should either be injecting dependencies on ... Patrick Lawson Patrick Lawson
This seems off to me. What is this line actually trying to do? I suggest either testing kwargs.get('fingerprint') for being ... Patrick Lawson Patrick Lawson
Cody Gibb
Stu Hood
Eric Ayers
Eric Ayers
Benjy Weinberger
John Sirois
Cody Gibb
Benjy Weinberger
Benjy Weinberger
Stu Hood
Eric Ayers
Cody Gibb
Review request changed

Status: Closed (submitted)

Change Summary:

Merged as 3488a6aec78363a70e5ea4ea6d7e92a28ca57f86

Patrick Lawson

The majority of this looks good, but I have strong reservations about TargetList--see below.

src/python/pants/base/payload_field.py (Diff revision 4)

This is a big red flag to me. Targets that have TargetList payload fields should either be injecting dependencies on the specs in that list via Target.traversable_dependency_specs, or they need to live with the fact that this list only represents an opaque list of strings for the purpose of fingerprinting.

I strongly think this plumbing of context should be removed--payloads are supposed to be ignorant of target and build graph semantics. I honestly thought that we'd gotten rid of context.resolve! But it looks like it still exists in just a couple of places (but not in any core code).

  1. What alternative would you propose for the usecase of fingerprinting options?

  2. Sorry, I don't understand the question. I think everything about this is fine, but there shouldn't be any baked in support for TargetList options. They should just be treated as opaque strings from the payload's perspective.

  3. That would mean that it would be possible to change the version referenced by a tool (in say, //:scala-compiler) and not have it affect the Task fingerprint.

  4. I would argue that the fingerprint of the tools that the task depends on should therefore be computed at the last second (by the task) and passed in to the PayloadField that does this. This could be hidden behind some helpers that resolve the tool targets and then pass in either a fingerprint of those targets or the (already resolved) targets. Mostly I'm just very uncomfortable with a PayloadField taking a context. I always intended for them to be immutable and eventually serializable, and leaking context into one definitely violates this goal in a big way.

  5. So this gets back to my unease at reusing the "payload" concept for this. "Payload" is a concept specific to targets. You've been reusing it here because it has the "I am a thing that can be fingerprinted" property, but these are still two different concepts. As Patrick mentions, FileField and TargetListField aren't really appropriate for target payloads, but they are useful as option types.

    I suggest not re-using Payload/PayloadField for all this. Maybe there's some "fingerprintable" common code that can be refactored out? But if not, no big deal either.

  6. Cody and Stu, let's please address this concern in a follow-up change.

  7. Absolutely, I'll try to address this first thing next week.

src/python/pants/option/options.py (Diff revision 4)

This seems off to me. What is this line actually trying to do? I suggest either testing kwargs.get('fingerprint') for being is None or clarifying why falseyness here is the right check. What does it mean if kwargs['fingerprint'] == ''?

  1. You have a good point. But it seems intuitive that fingerprint=False or fingerprint=None should result in the option not being fingerprinted, which is currently covered rather concisely. I'd also argue that passing "weird" stuff like fingerprint='' or fingerprint=5 would fall into unspecified behavior, since you're clearly abusing the parameter.

  2. I would prefer that unspecified behavior instead be a clear error or handled in a well-defined, documented way. This particular option is going to be used a lot (based on my read of this change), and often used by plugin authors and third parties who are less likely to be familiar with the implementation details of the interfaces they're using.