Fix problems with unicode in junit XML output when writing to HTML report

Review Request #4051 — Created July 6, 2016 and submitted

zundel
pants
zundel/unicode-in-html-report
3627, 3628
cf4019b...
pants-reviews
gmalmquist, molsen, stuhood

When there is unicode in the XML output, using --test-junit-html-report would result in an exception similar to:

File "/Users/zundel/Src/Pants/src/python/pants/backend/jvm/tasks/junit_run.py", line 568, in _execute
    _do_report(exception=None)
  File "/Users/zundel/Src/Pants/src/python/pants/backend/jvm/tasks/junit_run.py", line 562, in _do_report
    html_file_path = JUnitHtmlReport().report(self.workdir, os.path.join(self.workdir, 'reports'))
  File "/Users/zundel/Src/Pants/src/python/pants/backend/jvm/tasks/reports/junit_html_report.py", line 104, in report
    fp.write(self.generate_html(testsuites))

Exception message: 'ascii' codec can't encode character u'\xe5' in position 3508: ordinal not in range(128)

This change converts the encoding to binary before writing it to the file.

Added a unit test that reproduces the problem.

CI green at https://travis-ci.org/pantsbuild/pants/builds/143000429

ZU
CH
  1. 
      
  2. This seems like you would want ensure_text? I guess I'm not clear on what ensure_binary is doing here.

    It seems like the html generator should just be encoding the utf-8 characters in the html when it generates the html.

    1. That's what I thought at first too! I'm pretty sure its the write() command that can't handle utf-8 and so this just changes it into a buffer full of bytes.

    2. Oh, ok. So the html is correct? But writing it to the file was failing. Got it.

  3. 
      
CH
  1. Ship It!
  2. 
      
NH
  1. Hooray for better unicode support!

  2. src/python/pants/util/strutil.py (Diff revision 2)
     
     

    While you're in here, could you add a test for the error case?

  3. tests/python/pants_test/backend/jvm/tasks/reports/junit_html_report_resources/TEST-org.pantsbuild.UnicodeCharsTest.xml (Diff revision 2)
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     

    Are the properties necessary? I think it'd be clearer without them.

  4. Perhaps rather than asserting the stats, you could assert about the unicode contents? Or do we not handle unicode in source files? Now that would be 悲しい。

  5. 
      
ZU
ZU
Review request changed

Status: Closed (submitted)

Change Summary:

Thanks Chris & Nick. Commit 534eb57.

Loading...