Added invalidation check to UnpackJars task

Review Request #1776 — Created Feb. 15, 2015 and submitted

benjyw, jsirois, patricklaw

Added invalidation check to UnpackJars task.

This fell out of another change I'm working on to add caching to the ivy-imports task.

I feel like something is lacking in the test department. Is there a good example of how to unit/integration test this logic?

Added a unit test for the new FingerprintStrategy.

I ran this command repeatedly:

PANTS_DEV=1 ./pants compile ./examples/src/java/com/pants/examples/protobuf/unpacked_jars 

and didn't see any invalidations after the second one. This is the same as existing behavior, but I set a break point in the task to see that invalidation was working.

I also played with changing the downstream JarDependency version. This took me a while to get right, hence the new UnpackJarsFingerprintStrategy class

  • 0
  • 0
  • 5
  • 1
  • 6
Description From Last Updated
  2. Chanted superclass to inherit __hash__ and __eq__

  3. I was using this as a model and it seemed overyly complex

  4. I don't understand why invalidate_dependents would ever be false?

    1. It is in buildgen. There are definitely situations where you're only concerned with target-by-target invalidation, regardless of dependency structure. In buildgen's case, this is doing lexical analysis on source, which doesn't care at all about dependency structure.

  5. This should never happen unless someone nutzes around in .pants.d, but when I had a bug in my code it would trigger.

  6. tests/python/pants_test/ (Diff revision 1)

    I wanted to change around a definition in a BUILD file. Let me know if there is already precedent for something like this.

  1. Re testing, its all there, the trick is to return the actual targets operated upon from execute and raise a TaskError subclass with the successfully processed and failing targets as members.  See here - 97% test coverage for a task with semi-sophisticated invalidation:
    Also note Benjy's comments, its slightly hacky to use Task.execute / TaskError like this, but I think its a net win.
    1. Its almost all there... 1772 hasn't landed yet and there is a crucial fixup in TaskTest.setUp() needed to make it work...

    2. It has landed @
  2. I do not think this should be done. Or at the very least, we should restructure it somehow.

    If this gets cargo-culted into a FingerprintStrategy subclass that has __init__ params that affect fingerprinting (see JvmFingerprintStrategy), some very subtle invalidation bugs can result. This is because fingerprints are cached on the Target base class, and the cache key is the instance of the FingerprintStrategy, i.e. its __hash__.

    Maybe we should have a DefaultFPHashing mixin? Or we should force _hash and _eq abstract methods to make it very visible? I almost wish I could just make DefaultFingerprintStrategy "final".

    1. I reverted this change and added a mixin.

  3. Are you concerned about order stability here?

  4. Is this check necessary? I didn't think you got more back from invalidated() than you put in.

    1. I was passing more targets to invalidated() but I didn't need to.

  5. Are you sure you want to do this rather than letting a successful exit from the context manager update all of them as a "transaction"? Is this a performance hack?

    1. Didn't know about that. Fixed, added a comment to Task.execute()

  2. Is there a reason for this to be a subclass of TaskError? It isn't really an underlying task failing, but sort of an internal error.

  3. Racing with John to get this in.

    1. I'm starting to get temporaly very confused by your comments.  This is very much in master:
  2. I've been in the habit of sorting by the cache_key, or whatever string you're ultimately rolling into the hasher, since theoretically renaming a target shouldn't affect anything if the target is otherwise unchanged.

  3. Now unnecessary

  4. Docstring this? Ideally with a nice big warning about not using this when the mixed in class has instance attributes mixed into its fingerprints.

  1. Thanks for the TaskTest
  2. Odd indent, you probably want:

    unpacked_jars_list = [t for t in ...
                          if isinstance(t, UnpackedJars)]
  3. perhaps extend(...) instead of re-assign
  4. 2 blank lines between top-level types.
  2. Looks like I needed a rebase, sorry for the confusion.

Review request changed

Status: Closed (submitted)

Change Summary:

Thanks for the reviews! Commit a18627e