Add Scalafmt Support to Pants

Review Request #4216 — Created Sept. 9, 2016 and submitted

ity, kwlzn, stuhood

Integrating Scalafmt Style Checks into Pants.

The test Target now checks if .scala files conform to the configured Scalafmt settings. If not the Target run fails.
- Added a ScalaFmt Task, that checks if all .scala files in the target match the Scalafmt config
- Registered ScalaFmt Task to run as part of the test Target
- Added Configuration to pants.ini to read config and to disable the Scalafmt check. It is currently disabled.
- Added a default Scalafmt config
- Added Scalafmt Integration Test Cases

Github Pull Request
All tests and CI passes

  • 0
  • 0
  • 9
  • 2
  • 11
Description From Last Updated
  1. Hi Catie, welcome!

    I was doing some Pants code reviews and I noticed that the 'people' field was left blank on this CR. ReviewBoard can blackhole submissions that don't fill out that field, and won't email the list until a committer leaves a comment. I left a drive-by review, hopefully that will trigger the emails. You may be best served by following up with some names just in case.

    I found mostly nits, but I do have some suggestions about the task registration. Thanks for implementating this- I work at a Scala shop and will be interested in passing this over our repo.

    1. Great thanks added the folks at twitter I was consulting with during HackWeek as well!

    2. Sorry for the long delay in responding! RBcommons doesn't appear to have sent me an email for your reponse, Stu had to ping me.

      I left the open issue about putting it in the 'test' goal. You can 'Drop' that one if you find yourself unconvinced, or just want to hold off until the followup.

    3. sounds like this might be the age-old "RBCommons can't send email on behalf of because of security settings" issue.

      @Caitie I think you might need to either use a personal email address or sign up for a email in your rbcommons profile to get around this. admittedly, it's a slight stumbling block for new twtr-based contributors.

    4. @Kris: As of a few days ago that issue should be fixed on rbcommons side: they're now respecting DMARC rules. So moving forward it should be a non issue.

    5. oh, nice!

  2. Is there a reason to schedule this task to run during the 'test' phase? I thought I heard rumblings about a 'fmt' goal, to match gofmt.

    Installing this into test.scalafmt will result in some fingerprint churn.

    For example, the 'test' goal depends on 'compile', so a full compile phase would happen before scalafmt runs. That would mean a misformatted source file would be compiled once before the scalafmt failure and then must be recompiled a second time after it was fixed.

    Unfortunately, simple installation into fmt doesn't appear to provide enough guarantees about the task ordering. My suggestion below is to use Products registration and delegate task ordering to the round engine.

    1. This task just checks that the code conforms to scalafmt style. It doesn't actually format the code. Its essentially a test case that passes when the code is formatted correctly and fails when it does not.

      The thoughts were we could add actually formating the code to fmt.

    2. Okay. I saw what the code did but was unaware you had a followup planned. Adding to test as a temporary landing spot is okay with me, I guess.

      But for the record, installing it in test has the same problem, but worse. Because if you fail the "test case" and run scalafmt (manually or however) you will have to wait for both compile and test a second time.

      Considering that the task operates over all the targets in context, and imagine that I enabled scalafmt in my pants.ini.
      As I went about my day developing and testing - I would have to compile and test everything twice, for every different target I ran test on.

      So if foo, bar, and baz are misformatted test targets that depend on misformatted src targets, the work required is roughly double:

      1. ./pants test foo # runs a full compile and test
      2. scalafmt -i foo.closure().sources # ~roughly the command printed by the task currently
      3. ./pants test foo # The src files are changed - so another full compile and test
      4. do the same double compile/test for bar/baz.

      If the task is scheduled before compile then this problem goes away.

      Like I say above, if you would like to temporarily land in test, I suppose that is fine. But I don't see why that is better than landing it in fmt, and changing the skip option to dry-run or something.

    3. Thanks for the explanation Mateo! Updating to run as part of the compile target.

  3. Looks like you are missing some BUILD dependencies here.

    1. You are still missing the jvm tasks and jvm target BUILD deps.

  4. nit: the copyright year needs to be updated on all of of the new files in this patch

  5. Installing the task into a 'goal' in is one way to force scheduling of a task. But the round engine is a more flexible option, it dynamically schedules tasks based on the Product dependencies.

    I suggest some product types here, but it does rely on a bit of a round engine hack.

    It will force this task to be scheduled before the expensive resolve and compile tasks. The SimpleCodegenTask base class uses this same tactic, IIRC.

    task(name='scalafmt', action=ScalaFmt, serialize=False).install('fmt')

    def prepare(cls, options, round_manager):
      super(ScalaFmt, cls).prepare(options, round_manager)
    def product_types(cls):
      return ['scala']

    This would allow iterating on fmt errors without ever waiting through a compile phase. It also has the added benefit of accurately representing the involved products - scala code in, and scala code out :)

    1. So when we implement the actual formatting this seems reasonable. However right now its just essentially a test case check to say whether it passes scalafmt or not.

    2. Okay, I will drop this since you are planning a followup.

  6. I don't think you need to mix 'skip' into the fingerprint, whether or not you intend to add fingerprinting to this task. The fingerprint is calculated when a target is passed to a task's invalidation block. Considering this is set to early return on 'skip', there should be no way to include --skip value in a target fingerprint.

    1. This is still live. I am not going to reopen the issue, but it should be removed, imo.

  7. Having fingerprint=True here is probably for the best, though. Any future fingerprint strategy would want to invalidate for a new config file, so I consider it is fine to declare that in advance of actually implementing the fingerpinting.

    1. cool will leave in. Once again was modeling this off of ScalaStyle task

  8. You could use ','.join(sources) here, to save yourself the details. It also has the benefit of not creating all the intermediate lists.

  9. nit: Some variable names are camelCase, the python style is to use snake_case.

  10. I am not sure that asking for an out-of-band scalafmt run on failure is going to work, at least not quite as-is.

    One problem is that there isn't a guarantee that the user will be able to successfully run the output scalafmt command. Pants does bootstrap the scalafmt tool but that tool won't be made available to the user's shell. It's bootstrapped to the ~/.cache/pants dir and not surfaced to the path. Also, brew install olafurpg/scalafmt/scalafmt failed on my Mac when I tried it just now.

    My preference for handling misformatted files would probably be to register an option that would determine whether to run with -i or --test. That would delegate all of the tool invocations to Pants itself.

    That could probably work with minimal code changes but might also result in users being asked to commit fmt fixes for targets they never personally changed.

    Another option would be to just default to editing in place. It looks like you saw the isort task, that task has edit in-place as always on, so users have the choice of either skip or edit with no report-failures mode available.

    1. So I talked about this with Pants folks at Twitter. We don't want test or compile targets to modify files, could run into weird scenarios if you are using IntelliJ pants and/or scalafmt plugin as well.

      There are a bunch of options for distributing Scalafmt if people want to use it before the fmt command is implemented in pants. We talked about doing this via MDE at Twitter.

      Actually running Scalafmt via the fmt command is a next task. This is similair to the Golang gofmt development pass. So users have to directly invoke it. This is just step one, that's why its off by default for now.

    2. I think "We don't want test or compile targets to modify files", you mean s/targets/tasks/. In which case, I absolutely agree. This comment was predicated on my original recommendation to move the task out of the 'test' goal and into fmt.

      I am sure I could install scalafmt, I mostly meant that it isn't useful to print a command that a subset of users may not be able to easily run - better to just print the results of the test and let them take it or leave it.

      It is unclear to me if you are referencing the upstream Go fmt or the GoCheckstyle. But if it is the latter, that task's output points to the bootstrapped go distribution.

      A reasonable compromise for the scalafmt command to reference the bootstrapped scalafmt binary under the cachedir instead of assuming that it is already on the path. But I favor just removing the cmd until the fmt install is ready to ship.

    3. IIRC, scalastyle in the pants repo is installed in compile (see). Internally at Twitter move it to the beginning of the test goal.

      @caitie: To match the convention of the pantsbuild/pants repo, this should probably be installed in compile.

    4. I am also okay hashing this out in the next CR, to be clear, since it is non-mutating.

    5. Moved to run as part of compile. Will have another PR with the fmt command integration, to ensure the same binary is executing etc...

  11. errant whitespace here and on line 70

  12. I believe Pants has a preference of using for comprehensions in place of filter/lambda. I at least remember one push to replace them, at least : )

    Up to you, just an FYI.

    1. I lifted this code from scalastyle so i'll just leave as is which is listed in the pants documentation as the example task to follow.

  13. Using Target as a filter type seems a little broad to me. IMO, if a file isn't marked for compilation, it probably can skip fmt.

    Is there a subtle reason to not use ScalaLibrary? That change would also helpfully limit the number of targets that must be scanned. For example, using Target would force the file extension check on every PythonLibrary source file in context.

    And a bonus benefit to that change - maybe you wouldn't need to force skip in pants.ini. I worry that 'skip' will get accidentally cargo culted by users.

    1. Not sure, I grabbed this from the scalastyle task. Since Ideally Scalafmt will replace it.

    2. I did not reopen the issue since the task is disabled. But this really should be fixed before this task is enabled anywhere. It should be limited to Scala targets since those are the only supported target types. Looks like Stu has an example snippet below.

  14. The default timeout is 60s - is the bump to allow for slowness bootstrapping the tool?

    1. removed to revert to default for HackWeek I was running more involved tests just for examples that needed the additional time.

  15. nit: both comments are missing periods and a leading space:

    # Test should fail because of style error.

    although in both cases you might consider deleting the comment outright.

  1. Was the 2nd revision posted off a wrong parent? Some code seems to be missing.

    1. Yes sorry! Fixed. I've been having a bunch of issues with rbcommons

  1. Thanks Caitie!

  2. You can directly pass a predicate to self.context.targets(predicate=...).

  3. Let's leave out the comment about manually running the tool... I think the next step/review will be to install this same tool in the other goal in "--fix" mode.

  2. Can you add a TODO here to respect target_roots?

    If I add a Scala test, and run ./pants test.fmt test/jvm/org/pantsbuild/service/test, I may not want the dependencies in src/ checked.

    It is fine as a TODO for now, imo but is also probably a blocker for shipping the actual formatting.

    1. scalastyle checks transitively (and should, IMO)... and this is a style check.

      But I agree that we should make the followup "mutate my sources" task match the semantics of src/python/pants/backend/python/tasks/

    2. Hmm...I would think that it should either:
      a) operate on all scala sources (not using targets in context at all)
      b) respect the targets asked for on the command line.

      I would be surprised to run ./pants fmt.scala src/scala/org/pants/target/bin:bin and have dependent targets get mutated.

      Although, perhaps I still misunderstand the intended next steps. Reading your comment, will the "mutating" task be a separate task? If the task that actually changes files follows either of the two options above, I am happy. And possibly you could convince me that transitive fmt is the correct thing to do.

    3. Yes, it will be a separate task.

  3. I think you should add the checked in scalafmt config as the default arg to this option, since it is checked in to the repo.

    If there are reasons why that wouldn't work - I did not read the config file : ) - then you will need a None check here, since if self.get_options().configuration was unset you would be passing None in the args list passed to runjava()

    1. Agreed about the None check. You might even consider supporting an "empty" config for the None value.

      Re: the default: there is no reasonable default value for the config location probably, because we haven't really set a convention for where additional config goes.

    2. That sounds fine to me. At some early point, I added pants_configdir:

        def get_pants_configdir():
          """Return the pants global config directory."""
          # Follow the unix XDB base spec:
          config_home = os.environ.get('XDG_CONFIG_HOME')
          if not config_home:
            config_home = '~/.config'
          return os.path.expanduser(os.path.join(config_home, 'pants'))

      but I think only the android backend is using it as of now. If I was building that today, I might want it under the buildroot. Anyway, whichever you prefer is fine with me.

    3. This is a great point! Scalafmt has a built in default config if you don't specify --config flag. So I added a None check. If a config is not specified then the --config flag and path are not added to the command to test or the fix command

  1. Thanks for bearing with me on this Catie, looks great. There are a couple deviations from python style, but I consider it ready to land.

    1. Thanks for all the feedback and help!

  2. src/python/pants/backend/jvm/ (Diff revisions 3 - 4)

    Nice! I didn't know about first=True. This solves the issue nicely.

  1. Ship It!
Review request changed

Status: Closed (submitted)

Change Summary:

Merged as 8ba7de4388088573ba5fee04fee09472e8c92c09