#!/usr/bin/env python
#-*- coding: utf-8 -*-
u = u'moçambique'
print u.encode("utf-8")
print u
chmod +x test.py
./test.py
moçambique
moçambique
./test.py > output.txt
Traceback (most recent call last):
File "./test.py", line 5, in <module>
print u
UnicodeEncodeError: 'ascii' codec can't encode character
u'\xe7' in position 2: ordinal not in range(128)
import sys
if (sys.stdout.encoding is None):
print >> sys.stderr, "please set python env PYTHONIOENCODING=UTF-8, example: export PYTHONIOENCODING=UTF-8, when write to stdout."
exit(1)
(This is IPython shell)
In [1]: import sys
In [2]: sys.stdout
Out[2]: <colorama.ansitowin32.StreamWrapper at 0x3a2aac8>
In [3]: reload(sys)
<module 'sys' (built-in)>
In [4]: sys.stdout
Out[4]: <open file '<stdout>', mode 'w' at 0x00000000022E20C0>
In [11]: import IPython.terminal
In [14]: IPython.terminal.interactiveshell.sys.stdout
Out[14]: <colorama.ansitowin32.StreamWrapper at 0x3a9aac8>
There may be some code that relies on the UnicodeError being thrown for non-ASCII input, or does the transcoding with an error handler, which now produces an unexpected result. And since all code is tested with the default setting, you're strictly on "unsupported" territory here, and no-one gives you guarantees about how their code will behave.
Again, the worst thing is you will never know that because the conversion is implicit -- you don't really know when and where it happens. (Python Zen, koan 2 ahoy!) You will never know why (and if) your code works on one system and breaks on another. (Or better yet, works in IDE and breaks in console.)