fix/refactor checkstyle

Review Request #1432 — Created Dec. 5, 2014 and submitted

jinfeng
pants
fa8e473...
pants-reviews
benjyw, ity, jsirois, tejal, zundel

- Renaming BUILD.tools's twitter-checkstyle target to checkstyle since this is not
twitter specific
- Adding build-support/checkstyle/* to wire up the checkstyle
- A few longer than 100-character per line java files fixes
- pants.ini, fix/change broken [checkstyle] to [compile.checkstyle]
- pants.ini remove suppression_files setting:
There are a cuople reasons for doing this:
1) OS pants and com.puppycrawl.tools.checkstyle only supports single suppression file.
[HTML_REMOVED] The multiple suppresion files concept is from a Twitter internal custom filter, which
[HTML_REMOVED] isn't applicable here.
2) But more important, suppression isn't a required component for running checkstyle.
[HTML_REMOVED] Someone can perfectly author a coding_style.xml which contains no SuppressionFilter
[HTML_REMOVED] filter. And right now the checkstyle internally reads suppression_files list and injects
[HTML_REMOVED] them into the properties, which assuming the hardcoded SuppressionFilter. Doesn't make
[HTML_REMOVED] any sense. Anyone wants to use suppression, can still put checkstyle.suppression.file
[HTML_REMOVED] into properties setting under [compile.checkstyle] in OS pants, or any company specific
[HTML_REMOVED] pants.ini
- Adding the checkstyle task into the optional backend so when we run PANTS_DEV=1 ./pants
in OS pants, it's added, however, it's not by default wired up in pants release. Anyone
can still wire it up in their own wrapper (like what Twitter currently does)
- In ide_gen, currently it tries to read suppression_files from [checkstyle]. First, like
I mentioned above, the default open source checkstyle doesn't have multiple suppression
files concept. Second, the current setting points to an invalid file that doesn't exist
anyway. So instead, I'm setting it to [] for now and will follow up with ide/idea owners
later.
- Adding unit tests for checkstyle to prevent future regressions.

Merge branch 'master' of github.com:jinfeng/jinfeng-pants-fork into jin_add_checkstyle_to_optional_backend

PANTS_DEV=1 ./pants goal compile examples/src/java::
PANTS_DEV=1 ./pants goal test tests/python/pants_test/backend/jvm/tasks:checkstyle

Travis baking:
https://travis-ci.org/jinfeng/jinfeng-pants-fork/builds/43051952

  • 0
  • 0
  • 5
  • 0
  • 5
Description From Last Updated
JI
ZU
  1. 
      
  2. BUILD.tools (Diff revision 2)
     
     

    I would like to see if we update the version here and see that things still work? 5.6 is too old to support Java 8. 5.9 supports Java 8 and 6.1.1 is the latest. If this feels like scope creep I can follow on later.

    1. Let's do it in another rb.

  3. pants.ini (Diff revision 2)
     
     

    Note that 'ide_gen' also reads the [checkstyle] section. If you make a change here then you probably need to update that if you want to continue to get checkstyle support in the IDE.

    1. I've modified ide_gen.

  4. pants.ini (Diff revision 2)
     
     

    Since supression_files has been renamed, add an entry for it in src/python/pants/option/migrate_config.py

  5. It doesn't make a lot of sense to me to have checkstyle defined in the core backend section, but only optionally wired up here. The internal_backend is really only intended for pants development.

    Maybe there is some other way we could turn checkstyle on and leave it off by default, like a options 'enabled' or something that defaults to false?

    1. take a look https://rbcommons.com/s/twitter/r/1145/ look for John's comments

    2. I'm not sure if more explanation is needed, Eric has since created a page describing setting up a small plugin setup in a new repo, but even after we seperate the current jvm backend into a java backend such that a java project can just install the java backend, forcing checkstyle on means we pick a default style. We could reasonably pick Sun style and express an opinion forcing folks to turn off or re-configure the style in-play, but it seems to me the build tool shouldn't force language styles, it should just make it easy to turn enforcement on. Today turning a plugin on takes more than a single pants.ini line; however, given fine-grained plugin distribution and David Taylor's current work on plugin resolution, it would just be a single line containing the checkstyle plugin pypi requirement to fetch and activate the plugin. A second line to enable canned styles - say Sun or Google, and more lines if the user wanted to setup a custom checkstyle config.

  6. I don't know why this would be needed today, but I'd like to hear the back story.

    1. See http://checkstyle.sourceforge.net/config_coding.html 'RedundantThrows'. That check was historically used by Twitter and requires a classpath with all pussible exception base classes. That should explain the 'resolve' dep - and the 'gen' dep is there, because, for example, an exception thrown from an RPC might very reasonably extend org.apache.thrift.TApplicationException which is a synthetic runtime dep for thrift java gen targets injected by pants in the gen phase just before resolve.

  7. did you mean to revert th description here?

    1. Yes. the description becomes the compile description, no the scalastyle description once I thought it was. It looks weird and wrong.

  8. I'm not sure anyone is using exclusives_groups support any more. Foursquare was using it but they've substituted something else.

  9. Note, these lines are copied from ide_gen.py. There is this relationship between ide and checkstyle that we need to sort out. It would be nice I think if Ide and Checkstyle provided some static method to return these config variables instead of dipping directly into config in case we need to move this configuration around.

    1. this is ide_gen.py. I didn't change idea_gen.py.

      I agree idea and checkstyle should share something, that's what I wrote in the comment, that I'll follow up with Tejal.

  10. Nice tests! thanks.

  11. 
      
JI
ZU
  1. 
      
  2. 
      
PA
  1. 
      
  2. Use parens around the entire expression rather than backslashes. Also, the and trails the expression rather than going on the next line.

  3. 
      
JI
IT
  1. 
      
  2. are we removing/keeping this then?

    1. It should be kept - at least partially. There are 2 mainly orthoganal things going on here, confs and classpath augmentation. Lets ignore confs to start - the egroups stuff is all there to pluck out the classpath of the code being checkstyled and that's needed for certain checkstyle rules. The conf bit is secondary to that. It will eventually go away completely, but may not be able to be factored out today.

    2. I'm not following:

      The original code is:

      egroups = self.context.products.get_data('exclusives_groups')
      etag = egroups.get_group_key_for_target(targets[0])
      classpath = self.tool_classpath(self._checkstyle_bootstrap_key)
      cp = egroups.get_classpath_for_group(etag)
      classpath.extend(jar for conf, jar in cp if conf in self.get_options().confs)
      

      Note in my diff, I've moved

      classpath = self.tool_classpath(self._checkstyle_bootstrap_key)
      

      to the top of the function, so that's still there. The rest seems to me all related to confs stuff: getting egroups -> getting etag -> getting cp for the etag -> adding those cp whose conf appears in confs argument. If we said confs is no longer relevant, I don't see why those lines need to be kept.

    3. Its true confs are generally not relevant - we are trying to kill them, that said - forget about confs for a second - they truly are secondary here.
      There are classpaths:
      1. the tool classpath - this is needed to be able to run checkstlye at all, even just checkstyle -h
      2. the classpath of the code being checked

      Its the second classpath that is absolutely required by checkstyle rules like XXX. Unless you want to remove support for ./pants goal checkstyle checking those sorts of rules, that classpath must still be present after this change.
      The fact that you still need confs to populate a classpath is an artifact, we'd like to remove it, but folks reading classpaths still need to deal with it. In practice the confs always is the tuple ('default',) so the guard in the for comprehension here could likely be removed: classpath.extend(jar for conf, jar in cp if conf in self.get_options().confs). That said - you are not here to shave the conf yak, so you I'd just leave the code as-is.

    4. Where XXX is "See http://checkstyle.sourceforge.net/config_coding.html 'RedundantThrows'" - I talked about this example above in a reply to Eric's 1st feedback section.

    5. conf and classpath restored

  3. 
      
JS
  1. 
      
  2. If we need no suppressions today - why allow them at all in the pants repo?

    1. My thought is to show how people can use it. If we remove the suppresssions.xml, then the properties and 'checkstyle.suppression.file' setting would be removed from pants.ini as well. Not necessarily straightforward for people to figure out.

    2. OK - categorical imperative again - not sure of the answer, but do we want to express every default for every task in pantsbuild/pants own config? This could have the effect of people knowing the now-hidden customization incantations, but not knowing they are optional. In otherwords they are very likely to cargo-cult the config when they do not need to. The primary gate to pants 1.0 is still that a user need only touch pants.ini in their repo after installing pants with pip globally.

      It seems to me we need 2 pieces of information for good examples, what is optional configuration if you need to go custom, what is required configuration. Now pantsbuild/pants will always be central - if folks have a question about pants setup, they'll likely come to this repo and see the example config. Plugin writers though do not share this priviledge of prime location - no one will bother finding their plugin repo to look at its maximal config for the 1 task the plugin exports. Much much better is providing a pluggable way to allow a pants user to discover this information uniformly. We have the command line help today and that can be improved, but it treats pants core and random plugin equally, all options are displayed the same. For higher level docs / examples we have no established solution. It could be that ./pangt goal dictionary <task name|target name> gets built and that pops up docs in a browser. Those docs might be embedded in plugins in the form of markdown resource files in some uniform way - I'm not sure.

      Either way - I am of the opinion that doc examples will scale better than checked in examples even in the gap until someone writes the dictionary goal.

    3. I am definitely of the opinion that pants should work out of the box with an empty pants.ini. The new options system was designed with this goal in mind. Everything can and should have a sensible default in the option registration.

    4. removed

  3. Nope - this is detritus to sweep away.

    1. related to the classpath expansion section note I have above.

    2. Aha yes - I was wrong. We do want to sweep away confs, but this code does in fact need them to access the classpath of the code being checked by checkstyle.

  4. The target logging here and the source logging below reresent a lot of logging and that logging is redundant to information in the build server UI which lists both targets expandable to sources as well as the cmdline which has sources. Can this all be killed in favor of the existing logging?

  5. I think you already touch all % string interpolation formatting lines in this file - as such it would be good to -> {} / .format style interpolation.

  6. 
      
JI
JI
JS
  1. Thanks a bunch for the test.

  2. This still needs to be reverted - the resolve is needed for the code-being-checked classpath (for the few checkstyle rules that need a classpath). The gen is needed because it can contribute to the resolve classpath. In short - those deps are what provide egroups hits.

  3. alpha - when doing import ordering, ignore the leading import or from.

  4. Even though it works - its odd to read: s|src/java/|src/java|

  5. 
      
JI
DT
  1. 
      
  2. I think it's OK to hard-code ".java". If you wanted to make this more general, then just call it _SOURCE_EXTENSION, so that subclasses could do some other language.

    1. oh, there were two places hardcoded '.java' so i created a constant for it. it wasn't meant for general purpose use and i had no intention of subclassing neither.

    2. From my perspective (and maybe I'm wrong), it's OK to have two places that say ".java", because it's shorter and easier to understand than a constant. Constants are good for things that might change (".java" -> ".scala"); things people won't remember (3.1415926535897932 vs PI); and things people won't understand without context (7 vs DAYS_IN_WEEK). But for cases like this, I would prefer to skip the constant.

      This is totally a nit, tho, and it's OK to ship as-is.

    3. I'm actually just going to merge it as-is. Please mark this as submitted.

  3. 
      
DT
  1. Ship It!

  2. 
      
IT
  1. Ship It!

  2. 
      
JI
Review request changed

Status: Closed (submitted)

Loading...