Parallize thrift linter

Review Request #4351 — Created Nov. 3, 2016 and submitted

wisechengyi
pants
pants-reviews
benjyw, mateor, nhoward_tw, stuhood, zundel

Utilize WorkerPool to get more mileage out of thrift-linter with rougly 20% saving on time.

time for run in {1..10} ; do ./pants -q --no-cache-read clean-all thrift-linter <some large project> ;done

From 19m32s to 15m47s

https://travis-ci.org/pantsbuild/pants/builds/173574314

  • 0
  • 0
  • 0
  • 1
  • 1
Description From Last Updated
  1. 
      
  2. Instead of using submit_async_work here, you could construct a Work instance with arg_tuples for each vt, then use submit_work_and_wait.

    It might also make sense to tear down the pool when we're done with it.

    1. I tried submit_work_and_wait, but it returns a list of exit codes and halts when the first thread raises any error, which means it can catch only one exception. Therefore I had to stick with the current implementation where I can wait on MapResult objects for each Work.

    2. Well that's no good.

  3. I'm concerned that this might cause the errors to be swallowed. Are there tests that cover the error reporting?

    1. Existing tests contrib/scrooge/tests/python/pants_test/contrib/scrooge/tasks/test_thrift_linter_integration.py checks for LINTER-ERROR in output and exit status, which should be sufficient.

    2. I think I'd still prefer to keep the existing behavior for the TaskError message. The results still contain that info, so we should be able to do it.

      Having a list of the targets that failed printed at the end of the output is valuable in my opinion.

    3. Added back mechanism to collect errors and some more test coverage.

    4. Thanks!

  4. 
      
  1. LGTM!

  2. might be good to assert that the target name is there too.

    1. Thanks, good point! Added.

  3. 
      
Review request changed

Status: Closed (submitted)

Change Summary:

0093628056bbf454faf35709ee3c61acc133d0e9

Loading...