Fixing cobertura coverage so that it actually works.

Review Request #1704 — Created Feb. 3, 2015 and submitted

benjyw, ity, nhoward_tw

Changes in this commit:

- The three Cobertura phases are now invoked with different classpaths. This fixes problems where
jars in the full cobertura dependencies were conflicting all the time with jars in the run phase,
and publishing a shaded jar was too much of a hack for my tastes. This also allows for a different
reporting tool to be substituted for the reporting phase of cobertura.

- Three separate tools for the three stages of cobertura, as they need different classpaths and
different dependencies, to accomodate the previous point.

- Various fixes in how cobertura was actually invoked, in particular, the --auxclasspath option in
the instrumentation phase was crucial in getting reliable results.

- Miscellaneous code cleanups (see diffs)

travis-ci passed for

  • 0
  • 0
  • 11
  • 2
  • 13
Description From Last Updated
  2. how about --coverage-force? -when-tests-fail sounds like you only want coverage when tests have failed.

  3. you should use os.pathsep here to allow for the possibility of windows runs

    1. I learned something today :)

      I didn't even know java on windows used the windows PATH separator; they sure don't use the windows options character ('/'). I don't have a windows machine, so I'll take your word for it :)

  4. use .format instead of %

  1. Generally looks fine to me, but I'd like someone at Twitter to sign off on this, since we don't use this at Foursquare. And please comment the BUILD file...

  2. (Diff revision 1)

    Three different ways to bring in the same JAR? This definitely requires some comments.

  2. TODO(x) is a common way to search for real todos. Please stick to TODO, XXX, or NOTE depending on the context.

  3. .format is always preferred now.

  4. What's with the %?

  5. This shouldn't be wrapped in file, unless the interface of temporary_file is really weird. instrumented_classes_file should be a file object that is writable.

    Also, this would need to be 'wb' and explicitly encoded as utf-8: this would fail on classes that contain non-ascii characters. And they, unfortunately, do exist.

    1. Quick note: I know the RBC UI doesn't make it easy when looking through inline comments, but it's discouraged to reply to comments from within the diff UI's little dialog boxes. Instead of doing the natural thing and posting back to the thread of discussion, it just opens up a brand new top level reply. Generally we hop back to the "view reviews" UI to reply to comments and questions inline. I always keep two tabs open for this purpose.

    2. you mean like this? :)

  2. well, it was a reply to benjy's todo.

  3. preferred for everything, or just for logging (where we don't want to worry about a seldom-exercised line suddenly failing because of an incorrect format conversion?).
    1. For everything.

    2. There is always the argument of "don't change things gratuitously" vs. the Boy Scout rule ("leave the campsite cleaner than you found it"). I'm all for the latter. Is there a consensus in this group on what is preferred? if the latter, I'll fix these things every time I touch a file.

    3. I think there's consensus of boy scout rule for the lines you touch, not for the files you touch.  I tend to go boyscout on the whole file and often get called out for this, which I can't argue against and so generally relent.
    4. I do think that we prefer the former, actually. It makes for cleaner reviews and easier revertability. In this case, I couldn't see any other surrounding interpolation for context, so it just looked like new code to me. New code I think should always use the new way--so I'm not entirely in the "be consistent with context" camp here, since we're trying to push in a specific direction.

      But at the same time, we don't go through and change a bunch of interpolations within an unrelated review. However doing so in a separate review is always welcome!

    5. I didn't hear anything, so I just went in and fixed everything.

  4. it's there to see if you were paying attention :)

  5. The original (incorrect) reason for using temporary_file_path() was that I needed the filename after the file was closed (I incorrectly assumed that the file would disappear when closed, as opposed to when the context manager would exit.

    however, if you want it opened 'rb', (and to open it as utf8 it would have to go through, as the file ctor does not take an encoding), it has to go this route again; temporary_file does not take any additional keyword arguments, nor does it have provisions for opening with a codec.

    Moreover, I tried creating a java class with a greek name, put it in an identically-named file, and kept getting errors like this:

    CacheValidationError: Problem validating target in sandbox/users/ji/cover/src/main/java/com/twitter/j5: 'ascii' codec can't encode characters in position 54-63: ordinal not in range(128)

    it's not that the BUILD file is not somehow saved as utf8, I worked around that by using an ascii character as the first name of the file and using glob() in the sources parameter, so there is no utf8 processing when the BUILD file is read.

    Is this a deeper bug? DO you have an example with non-ascii filenames and classnames? the cucumber example in the tests doesn't count!

    1. Ah, I missed the _path suffix. In that case, you should be using another context manager to open (and implicitly close) the file object. It's extremely rare to use the file constructor without using it as a context manager, and in a case like this where the file has a nice well defined lifecycle, it should always be managed.

      You don't need to go through codecs to make sure it's utf-8 encoded: you can just open wb and then encode what you write as appropriate. Though using codecs is also fine, I suppose. But it wouldn't be consistent with what we do everywhere else for managing utf-8 encoded files.

      An isolated repro of the bug you're seeing with non-ascii Java classes would be much appreciated, that's probably an independent bug, and we should be testing that and guarding against regression. If you could put that in a separate CR and include a test marked xfail, I (or others) will investigate.

      I can't immediately think of code with non-ascii file/classnames, but we've certainly run into classfiles from third party jars with goofy names (shapeless). Regardless, if it's allowed, someone will abuse it--especially in scala land.

  6. so this should be something like:
    with temporary_file_path() as icf:, mode='w', encoding='utf-8').write(u'\n'.join(classes) + u'\n')
    is @classes a list of unicode strings?
    (wb is unnecessary with codecs, b is implied)
    1. It's unclear just based on eyeballing what classes is here, but it should be a sequence of unicode strings, yes. If there's uncertainty or ambiguity, we have the helper src/python/pants/util/ and its inverse.

    2. I filed and added it as a TODO.
      Testing with coverage (but without the suggested change above) of the only other unicode-related test I could find:
      PANTS_DEV=1 ./pants test.junit --coverage --coverage-processor=cobertura --coverage-patterns='com.pants.*' --coverage-xml --coverage-html testprojects/tests/java/com/pants/testproject/unicode:: -ldebug
      works and produces results, but it is not really testing unicode classnames that exist in files already.
  2. Missing an os.path import

    1. you don't need it. import os subsumes it.

    2. That is incorrect--import os only happens to work because import os.path has happened at some point in module loading before this point. It is fragile to not import what you actually use.

    3. Interesting--I just double checked and I appear to be wrong here. I know it's good style to import os.path if you're using it, but I'm surprised that I can't repro the error I expected in a REPL.

    4. Wrong:

      on linux

      $ python
      Python 2.7.3 (default, Mar 13 2014, 11:03:55)
      [GCC 4.7.2] on linux2
      Type "help", "copyright", "credits" or "license" for more information.

      import os.path

      Instead of importing this module directly, import os and refer to
      this module as os.path. The "os.path" name is an alias for this
      module on Posix systems; on other systems (e.g. Mac, Windows),
      os.path provides the same operations in a manner specific to that
      platform, and is an alias to another module (e.g. macpath, ntpath).

      Some of this can actually be useful on non-Posix systems too, e.g.
      for manipulation of the pathname component of URLs.

      on macos:

      $ python
      Python 2.7.8 (default, Aug 7 2014, 00:30:20)
      [GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)] on darwin
      Type "help", "copyright", "credits" or "license" for more information.

      import os.path
      (same output)

      on windows:
      Python 2.7.9 (default, Dec 10 2014, 12:28:03) [MSC v.1500 64 bit (AMD64)] on win32
      Type "copyright", "credits" or "license()" for more information.

      import os.path
      Help on module ntpath:

      ntpath - Common pathname manipulations, WindowsNT/95 version.


      Instead of importing this module directly, import os and refer to this
      module as os.path.

    5. packet collision :) (look at our timestamps)

  1. One last nit: could you change the title of this CR to reflect the overall nature of the commit and leave the historical stuff in the details section? I'm not sure if you can change RBC titles once they're out, but double check the commit title generated by ./rbt patch before upstreaming please.

  1. Ship It!
  1. Upstream at 9f9c1a951607e1b222402194ad817e90169cd08d; please mark as submitted.

Review request changed

Status: Closed (submitted)

Change Summary:

Upstream at 9f9c1a951607e1b222402194ad817e90169cd08d