I write a lot of quick-and-dirty Python scripts. One thing that bites me is printing useful data when an exception occurs. If that data contains non-ASCII characters and I write it to sys.stderr, I get a second exception inside my exception handler, which kills the program:
    import sys

    try:
        process(inputString)
    except:
        # file objects have write(), not writeln(); add the newline by hand
        sys.stderr.write("Error with input %s\n" % inputString)

    UnicodeEncodeError: 'ascii' codec can't encode character u'\x81' in position 0: ordinal not in range(128)
Very irritating. One workaround is to use Python’s logging package. It’s a hugely complex configurable beast, but the default config is useful: print all warnings and above to stderr. And it handles Unicode sensibly.
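Applied to the handler from the top of the post, that might look like the sketch below. The `process` function is a hypothetical stand-in that always fails, and `run` is just an illustrative name. The point is that `logging.warning` receives the input as an argument, and the stream handler catches errors raised while writing a record instead of letting them propagate into the script.

```python
import logging


def process(input_string):
    # Hypothetical stand-in for the real work; it always fails for this demo.
    raise ValueError("cannot process input")


def run(input_string):
    """Process one input; log a warning instead of dying if it fails."""
    try:
        process(input_string)
    except Exception:
        # Pass the input as an argument so logging does the formatting.
        # If writing the record fails, the handler reports the problem
        # itself rather than raising into this except block.
        logging.warning("Error with input %s", input_string)
        return False
    return True
```

Calling `run(u"\u2022 bad input")` should return False and leave a WARNING line on stderr under the default configuration.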
    >>> logging.warning('u"\x81"')
    WARNING:root:u"▒"
    >>> logging.warning('u"\u2022"')
    WARNING:root:u"\u2022"
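Not sure what it’s doing; maybe trying Latin-1 and escaping anything else? (Then again, both test strings above are plain byte strings that just contain the text u"...", so in Python 2 the \u2022 is probably being echoed literally rather than escaped by logging.) I don’t care if the output is mangled as long as it doesn’t kill my program.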
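If mangled-but-safe output is the goal, the same effect can be had without logging by escaping whatever the stream cannot encode. This is a minimal sketch, not what logging does internally. `safe_stderr` is a made-up helper name, it assumes the message is a text (unicode) string, and it falls back to ASCII when the stream does not report an encoding.

```python
import sys


def safe_stderr(message, stream=None):
    """Write message plus a newline, escaping characters the stream can't encode."""
    if stream is None:
        stream = sys.stderr
    # Assumption: a stream with no usable .encoding is treated as ASCII.
    encoding = getattr(stream, "encoding", None) or "ascii"
    # backslashreplace swaps unencodable characters for \xNN or \uNNNN
    # escapes instead of raising UnicodeEncodeError.
    safe = message.encode(encoding, "backslashreplace").decode(encoding)
    stream.write(safe + "\n")
```

In the exception handler this would replace the direct `sys.stderr.write` call, for example `safe_stderr(u"Error with input %s" % inputString)`.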