Add a test for the Git changelog.

Review Request #1792 — Created Feb. 18, 2015 and submitted

dturner-tw, jinfeng, zundel
This test confirms that a log with commits of mixed encodings with
non-ascii, non-utf-8 characters is properly decoded.

 src/python/pants/scm/BUILD              |   1 -
 src/python/pants/scm/             |  28 +++++++-------
 tests/python/pants_test/scm/ | 104 +++++++++++++++++++++++++++++++++-----------------
 3 files changed, 82 insertions(+), 51 deletions(-)
CI went green here:
  1. Thanks for working on this.

  2. src/python/pants/scm/ (Diff revision 1)

    Is this just a stylistic change or does python do something different when you omit the parens when returning a tuple?

    1. Stylistic - I cleaned up pylints in this file.
  3. src/python/pants/scm/ (Diff revision 1)

    Using empty braces without a label is new to me, I think I'll start using it more in the future. I didn't see the empty braces format described in or the python 2 docs, but found it in python 3 docs. I see that it works for me in my python 2.7 installation.

    1. Yes - I found out about it when Wickman's use in pex was reported broken for python2.6 [1] - which pex explicitly supports and we explicitly do not.
  4. src/python/pants/scm/ (Diff revision 1)

    nit: For consistency, I would prefer going with our convention of declaring

    logger = logging.getLogger(__name__)

    at the top of the file and converting all of the references to self._log to use the global logger instead.

    1. I was just linting here and not trying to change behavior.  I'd love to just keep the current use passed in logger or create a local one if thats OK with you.  Alternately, I could revert all my linting of this file.  Killing the log arg to the Git constructor is an API break and would have to go through a deprecation cycle - I would not be surprised for example if Twitter has a custom Task that passes self.context.log (I hazily recall this).
    2. should have read more closely, I missed the redirection to self.context.log 2 lines above. its fine.

  5. tests/python/pants_test/scm/ (Diff revision 1)

    NB: I looked at the implementation of environment_as(). I see it uses setenv() which I try to avoid because it can introduce memory leaks. I suppose its not much of a risk and is a lot more convenient than tacking on an environment to every subprocess call.

    1. Thanks for the heads-up.  I did not know / remember that note in the pydoc - that posix api is ... crazy.  I'm just sticking with the style here - a global change or series of them to switch from environment_as to passing subprocess an env is definitely out of scope.
  6. tests/python/pants_test/scm/ (Diff revision 1)

    nit: Add a comment that you are using the copyright symbol. I initially mistook it for @ and some editors/text utils will display it as looking like garbage in the string

  1. Ship It!
  1. This test is fine, but (to copy my remarks over from the other RB), the case I'm worried about is which I don't think this addresses.

    1. I think I have that addressed now, and Noralf Trønnes is further immortalized.
      I added a test case that did in fact trigger UTF-8 decoding errors and then fixed by plumbing a decode error policy of 'replace' for the changelog output.
    2. Awesome!

  1. Thanks folks - submitted @
Review request changed

Status: Closed (submitted)