Factor workunit failure into final exit code.

Review Request #4244 — Created Sept. 17, 2016 and submitted

wisechengyi
pants
3882
pants-reviews
benjyw, kwlzn, nhoward_tw, peiyu, stuhood, zundel

Previously if a workunit fails, the run will end with Pants printing FAILURE, but the exit code is 0.

This change factors the result of workunit into the final exit code as well.

Before:

$ ./pants --pythonpath=tests/python --backend-packages=pants_test.goal.data do-some-work

23:41:53 00:00 [main]
               (To run a reporting server: ./pants server)
23:41:53 00:00   [setup]
23:41:53 00:00     [parse]
               Executing tasks in goals: bootstrap -> do-some-work
23:41:54 00:01   [bootstrap]
23:41:54 00:01     [substitute-aliased-targets]
23:41:54 00:01     [jar-dependency-management]
23:41:54 00:01     [bootstrap-jvm-tools]
23:41:54 00:01     [provide-tools-jar]
23:41:54 00:01   [do-some-work]
23:41:54 00:01     [do-some-work]
23:41:54 00:01       [non.existent.main.class]
                     ==== stderr ====
                     java.lang.ClassNotFoundException: non.existent.main.class
                        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
                        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
                        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
                        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
                        at java.lang.Class.forName0(Native Method)
                        at java.lang.Class.forName(Class.java:264)
                        at com.martiansoftware.nailgun.NGSession.run(NGSession.java:242)

                     ==== stdout ====

23:41:54 00:01   [complete]
               FAILURE

$ echo $?
0

After:

$ ./pants --pythonpath=tests/python --backend-packages=pants_test.goal.data do-some-work

23:41:53 00:00 [main]
               (To run a reporting server: ./pants server)
23:41:53 00:00   [setup]
23:41:53 00:00     [parse]
               Executing tasks in goals: bootstrap -> do-some-work
23:41:54 00:01   [bootstrap]
23:41:54 00:01     [substitute-aliased-targets]
23:41:54 00:01     [jar-dependency-management]
23:41:54 00:01     [bootstrap-jvm-tools]
23:41:54 00:01     [provide-tools-jar]
23:41:54 00:01   [do-some-work]
23:41:54 00:01     [do-some-work]
23:41:54 00:01       [non.existent.main.class]
                     ==== stderr ====
                     java.lang.ClassNotFoundException: non.existent.main.class
                        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
                        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
                        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
                        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
                        at java.lang.Class.forName0(Native Method)
                        at java.lang.Class.forName(Class.java:264)
                        at com.martiansoftware.nailgun.NGSession.run(NGSession.java:242)

                     ==== stdout ====

23:41:54 00:01   [complete]
               FAILURE

$ echo $?
1

https://travis-ci.org/pantsbuild/pants/builds/161491405

  • 0
  • 0
  • 1
  • 0
  • 1
Description From Last Updated
  1. i think it might be legit to think workunit as side work, and not let its result affeect the main run.

    my understanding of the team is they actually want this to be warning, only surprised this now prints out a FAILURE, although our $? is 0.

    can we just ask them to add something like print_in_red('WARNING: Failed to run ensure tests', file=sys.stderr) to the plugin?

    1. seems to me like the crux of the problem solved here is to avoid the "prints FAILURE but exits 0" case, which is the super confusing/awkward bit because it passes CI (and users dont examine the logs closely) but appears to fail on laptops because users typically read the output and don't check $?.

      it doesn't seem like inverting this (i.e. to print "SUCCESS and exit 0" when there's a failed workunit) is the right way to go - so my guess is that we probably want the exact behavior here + perhaps some plumbing for superior workunit control for warn-only consumers NailgunTaskBase.runjava and friends?

      or instead, perhaps we could just sway the task consumer author in question here to embrace the failure mode? we know from experience that warnings tend to get ignored - so if it's worth warning about, it's probably worth failing for too.

  2. 
      
  1. lgtm!

  2. should this variable be called run_tracker_result instead?

  3. src/python/pants/goal/run_tracker.py (Diff revision 1)
     
     

    missing the trailing : in :return: ...

  4. src/python/pants/goal/run_tracker.py (Diff revision 1)
     
     
     
     
     

    more concise:

    return 1 if outcome == WorkUnit.FAILURE else 0
    
    1. You might want to also return non-zero on abort.

    2. simplified and included ABORT

  5. 
      
  1. 
      
  2. I'm not sure if we're guaranteed that goal_runner.run() returns a non-negative code. For example, sys.exit(-1) is valid in Python (the shell will interpret it as 255).

    1. It looks like junit_runner deals with that by calling abs() on the result values. I think that could work.

    2. sounds good! will add that

    3. corrected.

  3. 
      
  1. Did you try running your tests without your fix? When I did that, they both passed. I flipped the assertion, and saw that the failing one is raising an AttributeError.

    1. Thanks for catching this. It was insufficient to just check run failure, hence added the following as well

          self.assertIn('[run-dummy-workunit]', pants_run.stdout_data)
          self.assertNotIn('Exception', pants_run.stderr_data)
      
  2. tests/python/pants_test/goal/data/register.py (Diff revision 1)
     
     
     
     
     

    You'd have less machinery involved if you just used context.new_workunit along with set_outcome. You could also test that background runs will also have the behavior by having a flag that causes context.submit_background_work_chain to be invoked.

  3. tests/python/pants_test/goal/test_run_tracker_integration.py (Diff revision 1)
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     

    Could you add docstrings to these? It's not clear from looking at them how they cause a workunit success / failure.

    You could name the goal something more directly related to what you're testing.

    1. replaced --do-nothing with --[no-]success flag, so should be more self explanatory now.

  4. 
      
  1. Ship It!
  2. 
      
Review request changed

Status: Closed (submitted)

Change Summary:

b30acb6f857f938b82809317742bc57444683432 thanks gents!

Loading...