Bashing Unicode in Python

My whole work environment is UTF-8. Except Python. Python’s “print” will encode as UTF-8 if it’s printing directly to a tty. But as soon as you pipe that output to something else, Python reverts to ASCII. And it’s a real PITA to overcome that.

Here’s one ugly kludge to make Python always treat stdout as UTF-8. It causes problems with some tools, but is sufficient for hacking around:

import sys, codecs
# Python 2: wrap stdout so unicode strings get encoded as UTF-8, even when piped
sys.stdout = codecs.getwriter('utf8')(sys.stdout)
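
To see what that wrapper actually does, here is a minimal sketch that applies the same codecs.getwriter call to an in-memory byte stream instead of sys.stdout. The helper name utf8_writer and the io.BytesIO stand-in are assumptions for illustration only. On Python 3 the real target would be a byte stream such as sys.stdout.buffer rather than sys.stdout itself.

```python
import codecs
import io


def utf8_writer(byte_stream):
    """Wrap a byte stream so text written to it is encoded as UTF-8.

    Hypothetical helper: it only packages the codecs.getwriter call
    from the kludge above so it can be tried on any byte stream.
    """
    return codecs.getwriter('utf-8')(byte_stream)


# Stand-in for a piped stdout: an in-memory byte buffer (an assumption
# made so the result can be inspected).
buf = io.BytesIO()
out = utf8_writer(buf)

# Write a string containing a non-ASCII character (e with acute accent).
out.write(u'caf\u00e9\n')

# The bytes that reached the underlying stream.
encoded = buf.getvalue()
print(repr(encoded))
```

The point of the wrapper is that the encoding no longer depends on whether the stream is a tty: whatever text goes in comes out as UTF-8 bytes.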

2 thoughts on "Bashing Unicode in Python"

  1. Setting the PYTHONIOENCODING environment variable to “utf-8” works for me.

  2. Hey Ben! Thanks, yeah, I’m considering doing that. But then it only works in my environment; would rather bake the setting into the script. Mostly it just frustrates me that it’s 2012 and this is still a challenge; UTF-8 should be the default.
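
For anyone who wants to try the PYTHONIOENCODING suggestion from the first comment without touching their shell config, here is a rough sketch that sets the variable only for a child Python process and captures its piped output. The child_code string and the variable names are made up for illustration, and it assumes the interpreter running the script is the one you want to test.

```python
import os
import subprocess
import sys

# Copy the current environment and add PYTHONIOENCODING for the child only.
env = dict(os.environ, PYTHONIOENCODING='utf-8')

# Hypothetical child script: writes one non-ASCII string to stdout.
child_code = "import sys; sys.stdout.write(u'caf\\u00e9')"

# check_output captures stdout through a pipe, so the child is not
# talking to a tty, which is the situation the post complains about.
raw = subprocess.check_output([sys.executable, '-c', child_code], env=env)

print(repr(raw))
```

This still lives outside the script being run, so it does not answer the "bake it into the script" objection, but it does keep the setting out of the global environment.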

